TRANSGENIC PLANTS EXPRESSING PHOTORHABDUS TOXIN

BACKGROUND OF THE INVENTION 
As reported in WO98/08932, protein toxins from the genus
Photorhabdus have been shown to have oral toxicity against
insects. The toxin complex produced by Photorhabdus luminescens
(W-14), for example, has been shown to contain ten to fourteen
proteins, and it is known that these are produced by expression
of genes from four distinct genomic regions: tca, tcb, tcc, and
tcd. WO98/08932 discloses nucleotide sequences for the native
toxin genes.

Of the separate toxins isolated from Photorhabdus luminescens
(W-14), those designated Toxin A and Toxin B are especially
potent against target insect species of interest, for example
corn rootworm. Toxin A is comprised of two different subunits.
The native gene tcdA (SEQ ID NO:1) encodes protoxin TcdA (see
SEQ ID NO:1). As determined by mass spectrometry, TcdA is
processed by one or more proteases to provide Toxin A. More
specifically, TcdA is an approximately 282.9 kDa protein (2516
aa) that is processed to provide TcdAii, an approximately 208.2
kDa (1849 aa) protein encoded by nucleotides 265-5811 of SEQ ID
NO:1, and TcdAiii, an approximately 63.5 kDa (579 aa) protein
encoded by nucleotides 5812-7551 of SEQ ID NO:1.

Toxin B is similarly comprised of two different subunits. The
native gene tcbA (SEQ ID NO:2) encodes protoxin TcbA (see SEQ ID
NO:2). As determined by mass spectrometry, TcbA is processed by
one or more proteases to provide Toxin B. More specifically,
TcbA is an approximately 280.6 kDa (2504 aa) protein that is
processed to provide TcbAii, an approximately 207.7 kDa (1844
aa) protein encoded by nucleotides 262-5793 of SEQ ID NO:2, and
TcbAiii, an approximately 62.9 kDa (573 aa) protein encoded by
nucleotides 5794-7512 of SEQ ID NO:2.
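
The residue counts quoted above can be cross-checked against the recited nucleotide ranges with simple arithmetic. The following Python sketch is illustrative only; the helper name codon_count is hypothetical, and the ranges are those recited above for SEQ ID NO:1 and SEQ ID NO:2.

```python
def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in an inclusive, 1-based nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


# Nucleotide ranges recited in the text (inclusive, 1-based).
ranges = {
    "TcdAii": (265, 5811),
    "TcdAiii": (5812, 7551),
    "TcbAii": (262, 5793),
    "TcbAiii": (5794, 7512),
}

# Map each subunit name to the number of codons its range spans.
codons = {name: codon_count(first, last) for name, (first, last) in ranges.items()}
```

Under this arithmetic the TcdAii, TcbAii, and TcbAiii ranges span 1849, 1844, and 573 codons, matching the stated residue counts, while the TcdAiii range spans 580 codons, one more than the 579 residues stated. One possible reading, offered here only as an assumption, is that this range includes the stop codon.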
The native tcdA and tcbA genes are not well suited for high
level expression in plants. They encode multiple destabilization
sequences, mRNA splice sites, polyA addition sites and other
possibly detrimental sequence motifs. In addition, the codon
compositions are not like those of plant genes. WO98/08932 gives
general guidance on how the toxin genes could be reengineered to
be more efficiently expressed in the cytoplasm of plants, and
describes how plants can be transformed to incorporate the
Photorhabdus toxin genes into their genomes.

SUMMARY OF THE INVENTION 
In a preferred embodiment, the invention provides novel
polynucleotide sequences that encode TcdA and TcbA. The novel
sequences have base compositions that differ substantially from
the native genes, making them more similar to plant genes. The
new sequences are suitable for high expression in both monocots
and dicots, and this feature is designated by referring to the
sequences as satisfying "hemicot" criteria, which are set forth
in detail hereinafter. Other important features of the sequences
are that potentially deleterious sequences have been eliminated,
and unique restriction sites have been built in to enable adding
or changing expression elements, organellar targeting signals,
engineered protease sites and the like, if desired.

In a particularly preferred embodiment, the invention provides
polynucleotide sequences that satisfy hemicot criteria and that
comprise a sequence encoding an endoplasmic reticulum signal or
similar targeting sequence for a cellular organelle in
combination with a sequence encoding TcdA or TcbA.

More broadly, the invention provides engineered nucleic acids
encoding functional Photorhabdus toxins wherein the sequences
satisfy hemicot criteria.

The invention also provides transgenic plants with genomes
comprising a novel sequence of the invention that imparts
functional activity against insects.

BRIEF DESCRIPTION OF SEQUENCES

SEQ ID NO:1 is the native tcdA DNA sequence together with the
corresponding encoded amino acid sequence for TcdA.

SEQ ID NO:2 is the native tcbA DNA sequence together with the
corresponding encoded amino acid sequence for TcbA.

SEQ ID NO: 3 is an artificial sequence encoding TcdA that is
suitable for expression in monocot and dicot plants.

SEQ ID NO: 4 is an artificial sequence encoding TcbA that is
suitable for expression in monocot and dicot plants.

SEQ ID NO: 5 is an artificial hemicot sequence that encodes the
21 amino acid ER signal peptide of 15 kDa zein from Black
Mexican Sweet maize.

SEQ ID NO: 6 is an artificial hemicot sequence that encodes the
full-length native TcdA protein (amino acids 22-2537) fused to
the modified 15 kDa zein endoplasmic reticulum signal peptide
(amino acids 1-21).

DETAILED DESCRIPTION

The native Photorhabdus toxins are protein complexes that are
produced and secreted by growing bacterial cells of the genus
Photorhabdus. Of particular interest are the proteins produced
by the species Photorhabdus luminescens. The protein complexes
have a molecular size of approximately 1,000 kDa and can be
separated by SDS-PAGE gel analysis into numerous component
proteins. The toxins contain no hemolysin, lipase, type C
phospholipase, or nuclease activities. The toxins exhibit
significant toxicity upon ingestion by a number of insects.
A unique feature of Photorhabdus is its bioluminescence.
Photorhabdus may be isolated from a variety of sources. One such
source is nematodes, more particularly nematodes of the genus
Heterorhabditis. Another such source is human clinical samples
from wounds; see Farmer et al. 1989 J. Clin. Microbiol. 27 pp.
1594-1600. These saprophytic strains are deposited in the
American Type Culture Collection (Rockville, MD) ATCC #s 43948,
43949, 43950, 43951, and 43952, and are incorporated herein by
reference. It is possible that other sources could harbor
Photorhabdus bacteria that produce insecticidal toxins. Such
sources in the environment could be either terrestrial or
aquatic based.

The genus Photorhabdus is taxonomically defined as a member of
the Family Enterobacteriaceae, although it has certain traits
atypical of this family. For example, strains of this genus are
nitrate reduction negative, yellow and red pigment producing and
bioluminescent. This latter trait is otherwise unknown within
the Enterobacteriaceae. Photorhabdus has only recently been
described as a genus separate from Xenorhabdus (Boemare et al.,
1993 Int. J. Syst. Bacteriol. 43, 249-255). This differentiation
is based on DNA-DNA hybridization studies, phenotypic
differences (e.g., presence (Photorhabdus) or absence
(Xenorhabdus) of catalase and bioluminescence) and the Family of
the nematode host (Xenorhabdus; Steinernematidae, Photorhabdus;
Heterorhabditidae). Comparative, cellular fatty-acid analyses
(Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki
et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the
separation of Photorhabdus from Xenorhabdus.

Currently, the bacterial genus Photorhabdus is comprised of a
single defined species, Photorhabdus luminescens (ATCC Type
strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A
variety of related strains have been described in the literature
(e.g., Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845;
Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255;
Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186).

The following toxin producing Photorhabdus strains 
have been deposited: 
strain          accession number   date of deposit
W-14            ATCC 55397         March 5, 1993
WX1             NRRL B-21710       April 29, 1997
WX2             NRRL B-21711       April 29, 1997
WX3             NRRL B-21712       April 29, 1997
WX4             NRRL B-21713       April 29, 1997
WX5             NRRL B-21714       April 29, 1997
WX6             NRRL B-21715       April 29, 1997
WX7             NRRL B-21716       April 29, 1997
WX8             NRRL B-21717       April 29, 1997
WX9             NRRL B-21718       April 29, 1997
WX10            NRRL B-21719       April 29, 1997
WX11            NRRL B-21720       April 29, 1997
WX12            NRRL B-21721       April 29, 1997
WX14            NRRL B-21722       April 29, 1997
WX15            NRRL B-21723       April 29, 1997
H9              NRRL B-21727       April 29, 1997
Hb              NRRL B-21726       April 29, 1997
Hm              NRRL B-21725       April 29, 1997
HP88            NRRL B-21724       April 29, 1997
NC-1            NRRL B-21728       April 29, 1997
W30             NRRL B-21729       April 29, 1997
WIR             NRRL B-21730       April 29, 1997
B2              NRRL B-21731       April 29, 1997
ATCC 43948      ATCC 55878         November 5, 1996
ATCC 43949      ATCC 55879         November 5, 1996
ATCC 43950      ATCC 55880         November 5, 1996
ATCC 43951      ATCC 55881         November 5, 1996
ATCC 43952      ATCC 55882         November 5, 1996
DEP1            NRRL B-21707       April 29, 1997
DEP2            NRRL B-21708       April 29, 1997
DEP3            NRRL B-21709       April 29, 1997
P. zealandica   NRRL B-21683       April 29, 1997
P. hepialus     NRRL B-21684       April 29, 1997
HB-Arg          NRRL B-21685       April 29, 1997
HB Oswego       NRRL B-21686       April 29, 1997
Hb Lewiston     NRRL B-21687       April 29, 1997
K-122           NRRL B-21688       April 29, 1997
HMGD            NRRL B-21689       April 29, 1997
Indicus         NRRL B-21690       April 29, 1997
GD              NRRL B-21691       April 29, 1997
PWH-5           NRRL B-21692       April 29, 1997
                NRRL B-21693       April 29, 1997
HF-85           NRRL B-21694       April 29, 1997
A Cows          NRRL B-21695       April 29, 1997
MP1             NRRL B-21696       April 29, 1997
MP2             NRRL B-21697       April 29, 1997
MP3             NRRL B-21698       April 29, 1997
MP4             NRRL B-21699       April 29, 1997
MP5             NRRL B-21700       April 29, 1997
GL98            NRRL B-21701       April 29, 1997
GL101           NRRL B-21702       April 29, 1997
GL138           NRRL B-21703       April 29, 1997
GL155           NRRL B-21704       April 29, 1997
GL217           NRRL B-21705       April 29, 1997
GL257           NRRL B-21706       April 29, 1997



All strains were deposited in accordance with the terms of the
Budapest Treaty. Strains having accession numbers prefaced by
"ATCC" were deposited on the indicated date in the American Type
Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852
USA. Strains prefaced by "NRRL" were deposited on the indicated
date in the Agricultural Research Service Patent Culture
Collection (NRRL), National Center for Agricultural Utilization
Research, ARS-USDA, 1815 North University St., Peoria IL 61604
USA.

The present invention provides hemicot nucleic acid sequences
encoding toxins from any Photorhabdus species or strain that
produces a toxin having functional activity. Hemicot nucleic
acid sequences encoding proteins homologous to such toxins are
also encompassed by the invention.

Several terms that are used herein have a particular meaning and
are defined as follows:

By "functional activity" it is meant herein that the protein
toxin(s) function as insect control agents in that the proteins
are orally active, or have a toxic effect, or are able to
disrupt or deter feeding, which may or may not cause death of
the insect. When an insect comes into contact with an effective
amount of toxin delivered via transgenic plant expression,
formulated protein composition(s), sprayable protein
composition(s), a bait matrix or other delivery system, the
results are typically death of the insect, or the insects do not
feed upon the source which makes the toxins available to the
insects.

By "homolog" it is meant an amino acid sequence that is
identified as possessing homology to a reference Photorhabdus
toxin polypeptide amino acid sequence.

By "homology" it is meant an amino acid sequence that has a
similarity index of at least 33% and/or an identity index of at
least 26% to a reference Photorhabdus toxin polypeptide amino
acid sequence, as scored by the GAP algorithm using the BLOSUM62
protein scoring matrix (Wisconsin Package Version 9.0, Genetics
Computer Group (GCG), Madison, WI).

By "identity" is meant an amino acid sequence that contains an
identical residue at a given position, following alignment with
a reference Photorhabdus toxin polypeptide amino acid sequence
by the GAP algorithm.
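
As an illustration of how the identity index and the homology thresholds above might be applied, the following Python sketch takes two sequences that have already been aligned (with "-" marking gaps) and computes the percentage of identical residues. It is not the GCG GAP program; the function names, the gap symbol, and the choice to ignore gapped positions in the denominator are assumptions made for this example. The similarity index, which depends on the scoring matrix, is taken as an input rather than computed.

```python
def identity_index(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned (non-gap) positions carrying identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: positions opposite a gap are not scored
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_homology_definition(similarity_pct: float, identity_pct: float,
                              min_similarity: float = 33.0,
                              min_identity: float = 26.0) -> bool:
    """Apply the 'at least 33% similarity and/or at least 26% identity' test."""
    return similarity_pct >= min_similarity or identity_pct >= min_identity


# Toy alignment: five scored positions, four of them identical.
example_identity = identity_index("MKT-LV", "MRTALV")
```

Because the definition is phrased as "and/or", the sketch treats either threshold as sufficient on its own.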

By the use of the term "Photorhabdus toxin" it is meant any
protein produced by a Photorhabdus microorganism strain which
has functional activity against insects, where the Photorhabdus
toxin could be formulated as a sprayable composition, expressed
by a transgenic plant, formulated as a bait matrix, delivered
via baculovirus, or delivered by any other applicable host or
delivery system.

By the use of the term "toxic" or "toxicity" as used herein it
is meant that the toxins produced by Photorhabdus have
"functional activity" as defined herein.

By "substantial sequence homology" is meant either: a DNA
fragment having a nucleotide sequence sufficiently similar to
another DNA fragment to produce a protein having similar
biochemical properties; or a polypeptide having an amino acid
sequence sufficiently similar to another polypeptide to exhibit
similar biochemical properties.

As with other bacterial toxins, the rate of mutation of the
bacteria in a population causes many related toxins slightly
different in sequence to exist. Toxins of interest here are
those which produce protein complexes toxic to a variety of
insects upon exposure, as described herein. Preferably, the
toxins are active against Lepidoptera, Coleoptera, Homoptera,
Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions
herein are intended to capture the protein toxins homologous to
protein toxins produced by the strains

change in nomenclature alter the scope of the inventions
described herein.

The peptides and genes that are disclosed herein are named
according to the guidelines recently published in the Journal of
Bacteriology ("Instructions to Authors," pp. i-xii, Jan. 1996),
which is incorporated herein by reference.

Transformation methods useful in carrying out the invention are
well known, and are described, for example, in WO98/08932.

Hemicot tcdA and tcbA 

SEQ ID NO: 3 is the nucleotide sequence for an 
engineered tcdA gene in accordance with the invention. 
SEQ ID NO: 4 is the nucleotide sequence for an engineered
tcbA gene in accordance with the invention.

The following Tables 1 and 2 identify significant 
features of the engineered tcdA and tcbA genes. 



Table 1
tcdA

Feature                      nucleotides of SEQ ID NO: 3
NcoI                         1-6
HindIII                      48-53
KpnI                         246-254
sequence encoding TcbAii     267-5798
NheI                         333-338
BglII                        1215-1220
ClaI                         2604-2609
PstI                         4015-4020
                             5088-5093
MunI                         5598-5603
XbaI                         5778-5783
sequence encoding TcbAiii    5799-7517
AflII                        5853-5858
SphI                         6439-6444
SfuI                         7392-7397
SacI                         7519-7524
XhoI                         7522-7527
StuI                         7528-7533
NotI                         7533-7538


Table 2
tcbA

Feature                      nucleotides of SEQ ID NO: 5
NcoI                         1-6
HindIII                      48-53
KpnI                         246-251
sequence encoding TcbAii     267-5798
NheI                         333-338
BglII                        1215-1220
ClaI                         2604-2609
PstI                         4015-4020
Aprel                        5088-5093
MunI                         5598-5603
XbaI                         5778-5783
sequence encoding TcbAiii    5799-7517
AflII                        5853-5858
SphI                         6439-6444
SfuI                         7392-7397
SacI                         7519-7524
XhoI                         7522-7527
StuI                         7528-7533
NotI                         7535-7540

WO 01/11029    PCT/US00/22237



It should be noted that the proteins encoded by the
plant-optimized tcdA (SEQ ID NO: 3) and tcbA (SEQ ID NO: 5)
differ from the native proteins by the addition of an Ala
residue at position #2. This modification was made to
accommodate the NcoI site which spans the ATG start codon.

The following Table 3 compares the codon composition of the
engineered tcdA gene of SEQ ID NO: 3 and engineered tcbA gene of
SEQ ID NO: 5 with the codon compositions of the native genes,
typical dicot genes, and maize genes.

Table 3

amino  codon  % in SEQ   % in   % in SEQ   % in   % in    % in
acid          ID NO:3    tcdA   ID NO:5    tcbA   dicot   maize
Ala    GCT    62         21     69         41     42      24
       GCC    26         32     27         17     27      34
       GCA    11         25     4          22     25      18
       GCG    0          21     0          21     6       24
Arg    AGG    48         0      60         2      25      26
       CGC    22         36     18         16     11      24
       AGA    20         11     15         6      30      15
       CGT    11         39     7          57     21      11
       CGG    0          7      0          13     4       15
       CGA    0          8      0          6      8       9
Asn    AAC    100        32     100        33     55      68
       AAT    0          68     0          67     45      32
Asp    GAC    67         22     70         25     42      63
       GAT
Cys    TGC    100        30     100        19             68
       TGT
End    TGA    100        0      100        0              59
       TAG    0          0      0          0      19      21
       TAA    0          100    0          100    48      20
Gln    CAA    65         61     74         53     59      38
       CAG    35         39     26         47     41      62
Glu    GAG    100        24     98         36     51      71
       GAA    0          76     2          64     49      29
Gly    GGT    67         37     64         44     33      20
       GGC    32         36     36         22     16      42
       GGA    1          20     0          19     38      19
       GGG    0          8      0          16     12      20
His    CAC    62         40     72         31     46      62
       CAT    38         60     28         69     54      38
Ile    ATC    73         34     65         24     37      58
       ATT    27         51     35         59     45      28
       ATA    0          15     0          17     18      14
Leu    CTC    54         11     59         7      28      26
       TTG    29         17     25         32     26      15
       CTT    16         9      15         7      19      17
       TTA    0          18     0          19     10      5
       CTG    0          32     0          29     9       29
       CTA    0          13     0          7      8       8
Lys    AAG    99         79     99         75     61      78
       AAA    1          21     1          25     39      22
Met    ATG    100        100    100        100    100     100
Phe    TTC    100        42     100        41     55      71
       TTT    0          58     0          59     45      29
Pro    CCA    74         30     91         26     42      26
       CCT    22         28     7          20     32      22
       CCC    4          14     3          7      17      24
       CCG    0          27     0          47     9       28
Ser    TCC    47         19     55         11     18      23
       TCT    35         15     30         15     25      15
       AGC    18         22     15         18     18      23
       AGT    0          20     0          31     14      9
       TCG    0          7      0          8      6       14
       TCA    0          17     0          17     19      16
Thr    ACC    60         41     64         31     30      37
       ACT    28         25     32         34     35      20
       ACA    12         21     4          18     27      21
       ACG
Trp    TGG    100        100    100        100    100     100
Tyr    TAC    100        24     100        19     57      73
       TAT    0          76     0          81     43      27
Val    GTC    69         27     73         11     20      31
       GTG    21         17     22         27     29      39
       GTT    10         34     3          48     39      21
       GTA    0          22     2          14     12      8

EXAMPLE 1 

Design Of Plant Codon-Biased Genes Encoding W-14 Peptides
TcbA and TcdA

A. Gene Design 
The coding strands of the native DNA sequences of the
Photorhabdus W-14 genes encoding peptides TcbA and TcdA were
scanned for the presence of deleterious sequences such as the
Shaw/Kamen RNA destabilizing motif ATTTA, intron splice
recognition sites, and poly A addition motifs. This was done
using the MacVector Sequence Analysis Software (Oxford Molecular
Biology Group, Symantec Corp.), using a custom Nucleic Acid
Subsequence File. The native sequence was also searched for runs
of 4 or more of the same base.
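
The kind of scan described above can be sketched in a few lines of Python. This is not the MacVector procedure or its subsequence file; the function names and the short test sequence are hypothetical, and only two of the searches (the ATTTA motif and runs of four or more identical bases) are shown.

```python
import re


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of a motif on the given strand, allowing overlaps."""
    seq, motif = seq.upper(), motif.upper()
    return sum(1 for i in range(len(seq) - len(motif) + 1)
               if seq.startswith(motif, i))


def homopolymer_runs(seq: str, min_len: int = 4):
    """Return (0-based start, base, length) for each run of min_len or more."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Hypothetical test sequence, not taken from any SEQ ID NO.
demo = "GGATTTATTTACAAAAAGCTTTT"
attta_hits = count_motif(demo, "ATTTA")
runs = homopolymer_runs(demo)
```

Overlapping matches are counted deliberately, since two ATTTA motifs can share a terminal A, as they do in the demo sequence.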

Motif searching of the native W-14 tcbA and tcdA genes revealed
the presence of many potentially deleterious sequences in the
protein coding strands, as summarized in Table 4. Not shown, but
also present, were many runs of four or more single residues
(e.g., the native tcbA gene has 81 runs of four A's).



Table 4

Native   ATTTA   5' Splice   3' Splice   Poly A      RNAP II
Gene                                     Addition*   term.
tcbA     18      7           17          46          0
tcdA     18      7           13          77          1

* Totals of 16 different motifs.



Analyses of eukaryotic genes and plant genes in particular have
shown that CG & TA doublets are underrepresented, while the
genes are enriched in CT & TG doublets. The sequences of the
hemicot biased genes have accordingly been adjusted to encompass
these base compositions and to have G+C compositions of about
53%, similar to many plant genes. When compared to the native
W-14 tcbA and tcdA genes, the plant-biased genes have a much
more uniform G+C distribution.
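
The two base-composition measures discussed above, overall G+C content and doublet (dinucleotide) frequency, are straightforward to compute. The Python sketch below is illustrative; the function names and the short test sequence are hypothetical and are not drawn from the sequence listing.

```python
from collections import Counter


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def doublet_counts(seq: str) -> Counter:
    """Count every overlapping dinucleotide in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


# Hypothetical 12-base test sequence.
demo = "ATGGCTCTGTAA"
demo_gc = gc_percent(demo)
demo_doublets = doublet_counts(demo)
```

Applying gc_percent to successive windows of a gene, rather than to the whole sequence, would give one simple way to look at how uniform the G+C distribution is along its length.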

Nucleotide changes to remove potentially deleterious sequences
were chosen to simultaneously adjust the codon composition of
the coding region to more closely reflect that of plant genes. A
framework for these changes was provided by the codon bias
tables prepared for maize and dicot genes shown in Table 3.
Comparison of codon compositions of the native W-14 genes to
maize and dicot genes revealed that the W-14 genes contain a
very different preference set of the degenerate codons for the
18 amino acids for which there is a choice (Table 3). For each
of 8 amino acids (Phe, Tyr, Cys, Arg, Asn, Lys, Glu, and Gly) in
both W-14 genes, the most abundant codon is different from the
preferred codons found in either maize or dicot genes. One might
expect that translational difficulties would be encountered in
efforts to produce in plants proteins (such as TcbA and TcdA)
having high relative amounts of these amino acids from mRNAs
having large numbers of nonpreferred codons. There is a marked
difference in distribution of the codon compositions specifying
the other 10 amino acids. For His, Gln, Ile, Val, and Asp, the
dicot-preferred codons are found as the most abundant ones in
both W-14 genes. For Leu, Thr, Ser, and Ala, the maize-preferred
codons are the most abundant codon choices found in the tcdA
gene. In contrast, the tcbA gene contains only the CCG (Pro)
maize-preferred codon as the highest abundance choice.
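
Percentages of the kind tabulated in Table 3 express, for each amino acid, how its occurrences are divided among its synonymous codons. The Python sketch below shows one way such figures could be computed from a coding sequence using the standard genetic code; the function name codon_usage and the short test sequence are hypothetical, and the sketch makes no claim about how the values in Table 3 were actually produced.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(cds: str) -> dict:
    """Percent usage of each codon among the codons for the same amino acid."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds), 3))
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: 100.0 * n / totals[CODON_TABLE[codon]]
            for codon, n in counts.items()}


# Hypothetical reading frame: Met-Ala-Ala-Ala-Lys-Lys-stop.
usage = codon_usage("ATGGCTGCTGCCAAGAAATAA")
```

Codons that do not occur in the input are simply absent from the returned dictionary; a caller wanting a complete 64-row table would need to fill those in as zero.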

In making the codon choices, doublet contents were 
considered, so that adjacent codons preferably did not 
form CG or TA doublets (which are underrepresented in 
eukaryotic genes; 1, 4), while CT or TG doublets (which 
are enriched in eukaryotic genes ibid.) were created when 
possible. 

Choices were also made to utilize a diversity of 
codons for Met, Trp, Asn, Asp, Cys, Glu, His, Ile, Lys,
Phe, Thr, and Tyr. 

The sequences were also designed to encode unique 6-bp
recognition sites for restriction enzymes, spaced about every
1200 bp. Finally, an additional codon (GCT; Ala) was inserted at
the second position to encode an NcoI recognition site
encompassing the ATG (Met) start codon. Additional recognition
sites were included after the stop codon to facilitate
subsequent cloning steps into expression vectors. These features
are set forth above in Tables 1 and 2.
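
The reason an added second codon beginning with G is needed can be seen from the NcoI recognition sequence, CCATGG: the site runs from two bases upstream of the ATG through the first base of the following codon. The Python sketch below illustrates this with a hypothetical short reading frame; the function names are invented for the example, and the placement of the site at positions 1-6 follows Tables 1 and 2.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence


def add_ncoi_start(cds: str, second_codon: str = "GCT") -> str:
    """Prefix CC and insert a codon after ATG so the start reads CCATGG."""
    cds = cds.upper()
    second_codon = second_codon.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with ATG")
    if len(second_codon) != 3 or second_codon[0] != "G":
        raise ValueError("inserted codon must start with G to complete CCATGG")
    return "CC" + "ATG" + second_codon + cds[3:]


def site_positions(seq: str, site: str) -> list:
    """1-based inclusive (start, end) positions of every occurrence of site."""
    seq, site = seq.upper(), site.upper()
    return [(i + 1, i + len(site))
            for i in range(len(seq) - len(site) + 1)
            if seq.startswith(site, i)]


# Hypothetical reading frame: Met-Asn-Glu-Ser-stop.
engineered = add_ncoi_start("ATGAACGAGTCCTGA")
ncoi_hits = site_positions(engineered, NCOI_SITE)
```

A site is unique in the sense used above when site_positions returns exactly one entry for it over the whole engineered sequence.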

The new tcdA and tcbA genes of SEQ ID NO: 3 and SEQ ID NO: 4
share 73.5% and 72.6% identity, respectively, to their native
W-14 counterparts (Wisconsin Genetics Computer Group, GAP
algorithm).
B. Gene Synthesis 

The complete synthesis of the plant codon-biased tcbA and tcdA
genes was performed under contract by Operon Technologies, Inc.
(OPTI, Alameda, CA). Basically, chemically synthesized
oligonucleotides of appropriate sequence were assembled into DNA
pieces about 500 bases long. These were joined together
end-to-end (presumably by means of appropriately placed
restriction enzyme sites) into four larger pieces of roughly 2
kilobase pairs (kbp) each; therefore each comprised about 1/4 of
the entire coding region of the particular gene. DNA sequence of
the pieces was confirmed at this step. If mistakes in sequence
were present, the appropriate oligonucleotides were
re-synthesized, and the assembly process was repeated. Once gene
fractional parts were sequence verified, they were assembled in
pairs to make the gene halves, and again sequence verified.
Finally, the two halves were joined, and the sequences of the
junctions between the halves were verified. Therefore, each part
of the new gene was sequence verified at least twice.

It should be noted that attempts to express the native tcbA or
tcdA genes in standard Escherichia coli cloning strains suggest
that production of these proteins is lethal. Lethality problems
may be encountered if standard cloning vectors having leaky
expression from inherent lacZ promoters are used to assemble
these genes.
C. Addition Of Endoplasmic Reticulum Targeting Peptide To
TcdA Coding Region

It is known to those in the field of plant gene expression that
proteins are specifically directed into the endoplasmic
reticulum (ER) by means of a short signal peptide which is
removed during or after the transport process through the ER
membrane. The mature (processed) protein is incorporated into
the ER endomembrane or is released into the ER lumen where the
transported protein may be uniquely folded (aided by
chaperonins), modified by glycosylation, accumulated in the
vacuole, or additionally translocated (by secretion). These
processes are reviewed by Gomord and Faye [V. Gomord and L.
Faye, (1996) Signals and mechanisms involved in intracellular
transport of secreted proteins in plants. Plant Physiol.
Biochem. 34:165-181] and by Bar-Peled et al. [M. Bar-Peled, D.
C. Bassham, and N. V. Raikhel, (1996) Transport of proteins in
eukaryotic cells: more questions ahead. Plant Molec. Biology
32:223-249]. It is also known that the subcellular recognition
mechanisms for an ER signal peptide are evolutionarily somewhat
conserved, since the ER signal for a protein normally produced
in monocot (maize) cells is recognized and processed normally by
dicot (tobacco) cells. This is exemplified by the maize 15 kDa
zein ER signal peptide [L. M. Hoffman, D. D. Donaldson, R.
Bookland, K. Rashka, and E. M. Herman, (1987) Synthesis and
protein body deposition of maize 15-kd zein in transgenic
tobacco seeds. EMBO J. 6:3213-3221, and U.S. Patent 5589616].
Further, it is known that the ER signal peptide derived from one
protein can direct the translocation of a different protein if
it is appropriately attached to the second protein by genetic
engineering methods [D. C. Hunt and M. J. Chrispeels, (1991) The
signal peptide of a vacuolar protein is necessary and sufficient
for the efficient secretion of a cytosolic protein. Plant
Physiol. 96:18-25, and Denecke, J., J. Botterman, and R.
Deblaere (1990) Protein secretion in plants can occur via a
default pathway. Plant Cell 2:51-59]. Therefore, one may expose
a protein in vivo to different biochemical environments by
directing its accumulation in the cytosol (by not providing a
signal peptide sequence), or in the ER/vacuole (by provision of
an appropriate signal peptide).

The ER signal peptide of maize 15 kDa zein proteins is known to
comprise the first 20 amino acids encoded by the zein coding
region. Two examples of such signal peptides are the ER signal
peptide of 15 kDa zein from A5707 maize, NCBI Accession #
M72708, and the ER signal peptide of 15 kDa zein from Black
Mexican Sweet maize, NCBI Accession # M13507. There is only a
single amino acid difference (Ser vs Cys at residue 17) between
these signal peptides.

SEQ ID NO: 5 is a modified sequence coding the ER signal peptide
of 15 kDa zein from Black Mexican Sweet maize. The modifications
embodied in this sequence were made to accommodate the different
monocot/dicot codon usages and other sequence motif
considerations discussed above in the design of the
plant-optimized tcdA coding region. The sequence includes an
additional Ala residue at position #2 to accommodate the NcoI
site which spans the ATG start codon.

SEQ ID NO: 6 gives a sequence coding for the full-length native
TcdA protein (amino acids 22-2537) fused to the modified 15 kDa
zein endoplasmic reticulum signal peptide (amino acids 1-21).

Example 2 

Transformation Of Tobacco With Agrobacterium Carrying 
Plasmid pDAB2041 Encoding Photorhabdus Toxins 
A. Plasmid pDAB2041 

Preparation of tobacco transformation vectors was accomplished
in three steps. First, a modified plant-optimized tcdA coding
region was ligated into a tobacco plant expression cassette
plasmid. In this step, the coding region was placed under the
transcriptional control of a promoter functional in tobacco
plant cells. RNA transcription termination and polyadenylation
were mediated by a downstream copy of the terminator region from
the Agrobacterium nopaline synthase gene. Two plasmids designed
to function in this role are pDAB1507 and pDAB2006. In the
second step, the complete gene comprised of the promoter, coding
region, and terminator region was ligated between the T-DNA
borders of an Agrobacterium binary vector, pDAB1542. Also
positioned between the T-DNA borders was a plant selectable
marker gene to allow selection of transformed tobacco plant
cells. In the third step, the engineered binary vector plasmid
was conjugated from its E. coli host strain into a disabled
Agrobacterium tumefaciens strain capable of transforming tobacco
plant cells that regenerate into fertile transgenic plants.

It is a feature of plasmid pDAB1507 that any coding region
having an NcoI site at its 5' end and a SacI site 3' to the
coding region, when cloned into the unique NcoI and SacI sites
of pDAB1507, is placed under the transcriptional control of an
enhanced version of the CaMV 35S promoter. It is also a feature
of pDAB1507 that the 5' untranslated leader (UTR) sequence
preceding the NcoI site comprises a modified version of the 5'
UTR of the MSV coat protein gene, into which has been cloned an
internally deleted version of the maize Adh1S intron 1.
Additionally it is a feature of pDAB1507 that transcription
termination and polyadenylation of the mRNA containing the
introduced coding region are mediated by termination/Poly A
addition sequences derived from the nopaline synthase (Nos)
gene. Finally, it is a feature of pDAB1507 that the entire
assembly of promoter/coding region/3' UTR can be obtained as a
single DNA fragment by cleavage at the flanking NotI sites.




It is a feature of plasmid pDAB2006 that any coding region
having an NcoI site at its 5' end and a SacI site 3' to the
coding region, when cloned into the unique NcoI and SacI sites
of pDAB2006, is placed under the transcriptional control of the
CaMV 35S promoter. It is also a feature of pDAB2006 that the 5'
untranslated leader (UTR) sequence preceding the NcoI site
comprises a polylinker. Additionally it is a feature of pDAB2006
that transcription termination and polyadenylation of the mRNA
containing the introduced coding region are mediated by
termination/Poly A addition sequences derived from the nopaline
synthase (Nos) gene. Finally, it is a feature of pDAB2006 that
the entire assembly of promoter/coding region/3' UTR can be
obtained as a single DNA fragment by cleavage at the flanking
NotI sites.

It is a feature of pDAB1542 that any DNA fragment
flanked by NotI sites can be cloned into the unique NotI
site of pDAB1542, thus placing the introduced fragment
between the T-DNA borders, and adjacent to the neomycin
phosphotransferase II (kanamycin resistance) gene.

To prepare a plant-expressible gene to produce the
non-targeted TcdA protein in tobacco plant cells, DNA of
a plasmid (pA0H_4-OPTI) containing the plant-optimized
tcdA coding region (SEQ ID NO: 3) was cleaved with
restriction enzymes NcoI and SacI, and the large 7550 bp
fragment was ligated to similarly cut DNA of plasmid
pDAB1507 to produce plasmid pDAB2040. DNA of pDAB2040
was then digested with NotI, and the 8884 bp fragment was
ligated to NotI-digested DNA of pDAB1542 to produce
plasmid pDAB2041. This plasmid was then conjugated by
triparental mating [Firoozabady, E., D. L. DeBoer, D. J.
Merlo, E. L. Halk, L. N. Amerson, K. E. Rashka, and E. E.
Murray (1987) Transformation of cotton (Gossypium
hirsutum L.) by Agrobacterium tumefaciens and
regeneration of transgenic plants. Plant Molec. Biol.
10:105-116] from the host Escherichia coli strain (XL1-
Blue, Stratagene, La Jolla, CA) into the nontumorigenic
Agrobacterium tumefaciens strain EHA101S, which is a
spontaneous streptomycin-resistant mutant of strain
EHA101 (Hood, E. E., G. L. Helmer, R. T. Fraley, and
M.-D. Chilton (1986) The hypervirulence of Agrobacterium
tumefaciens A281 is encoded in a region of pTiBo542
outside of T-DNA. J. Bacteriol. 168:1291-1301). Strain
EHA101S (pDAB2041) was then used to produce transgenic
tobacco plants that expressed the TcdA protein.

B. Plasmid pRK2013

To prepare a plant-expressible gene to produce the
endoplasmic reticulum-targeted TcdA protein in tobacco
plant cells, DNA of a plasmid (pA0H_4-ER) containing the
plant-optimized, ER-targeted tcdA coding region (SEQ ID
NO: 6) was cleaved with restriction enzymes NcoI and SacI,
and the large 7610 bp fragment was ligated to similarly
cut DNA of plasmid pDAB2006 to produce plasmid pDAB1833.
DNA of pDAB1833 was then digested with NotI, and the 8822
bp fragment was ligated to NotI-digested DNA of pDAB1542
to produce plasmid pDAB2052. This plasmid was then
conjugated by triparental mating from the host
Escherichia coli strain (XL1-Blue) into the
nontumorigenic Agrobacterium tumefaciens strain EHA101S.
Strain EHA101S (pDAB2052) was then used to produce
transgenic tobacco plants that expressed the TcdA protein
containing an amino-terminal endoplasmic reticulum
targeting peptide.
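
The fragment sizes quoted for the two tobacco constructs imply how
much promoter, leader, and terminator sequence each cassette plasmid
contributes to the NotI fragment. The following is a minimal
bookkeeping sketch in Python. The function name is hypothetical, and
the calculation assumes that each NotI fragment consists only of the
NcoI/SacI insert plus cassette sequence, which the text does not
state explicitly.

```python
def cassette_overhead(insert_bp: int, noti_fragment_bp: int) -> int:
    """Return the base pairs in the NotI fragment that do not come from the insert."""
    if noti_fragment_bp < insert_bp:
        raise ValueError("NotI fragment cannot be smaller than its insert")
    return noti_fragment_bp - insert_bp


# (insert size, NotI fragment size) in bp, as quoted in the text.
constructs = {
    "pDAB2040 (pDAB1507 cassette)": (7550, 8884),
    "pDAB1833 (pDAB2006 cassette)": (7610, 8822),
}

overheads = {name: cassette_overhead(*sizes) for name, sizes in constructs.items()}

for name, bp in overheads.items():
    print(f"{name}: {bp} bp outside the coding-region insert")
```

On these figures the pDAB1507 cassette accounts for 1334 bp and the
pDAB2006 cassette for 1212 bp of the respective NotI fragments.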

C. Transfer of Plasmid pDAB2041 Into Agrobacterium Strain
EHA101S

Cultures of E. coli carrying the engineered Ti
plasmid pDAB2041 (the plasmid containing the rebuilt
Toxin A gene, tcdA), E. coli carrying the plasmid
pRK2013, and Agrobacterium strain EHA101S were grown
overnight, then mixed 1:1:1 on plain LB medium solidified
with agar and cultured in the dark at 28°C. Two days
later, the lawn of bacteria was scraped up with a loop,
suspended in plain LB medium, vortexed, and then diluted
1:10^4, 1:10^5, and 1:10^6 in plain LB liquid medium.
Aliquots of these dilutions were spread on selective
plates containing YEP medium plus erythromycin (100 mg/L)
and streptomycin (250 mg/L) and grown at 28°C. Two days
later, single colonies were picked and streaked onto the
same medium, then spread to give single colonies. Single
colonies were picked again and streaked, then spread for
single colonies. Single colonies were picked a third
time, grown as streaks, then subjected to a quality
analysis involving growth on lactose medium and a
chromogenic assay with Benedict's reagent. Of ten strains
developed in this way, the one giving the fastest-coloring
colony was chosen for further work.
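
The plating step above uses three ten-fold dilutions so that at least
one plate carries well-separated colonies. As an illustration only, a
colony count from such a plate can be converted back to a titer of
the scraped suspension. The colony count and plated volume below are
hypothetical; the text gives neither.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_ml: float) -> float:
    """Estimate colony-forming units per mL of the undiluted suspension."""
    if dilution_factor <= 0 or plated_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_ml


# Dilution factors named in the text.
DILUTIONS = (10**4, 10**5, 10**6)

# Hypothetical example: 42 colonies on the 1:10^5 plate, 0.1 mL spread.
example_titer = cfu_per_ml(colonies=42, dilution_factor=DILUTIONS[1], plated_ml=0.1)
print(f"Estimated titer: {example_titer:.2e} CFU/mL")
```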

D. Transformation Of Tobacco With Agrobacterium Carrying
Plasmid pDAB2041

Tobacco transformation with Agrobacterium
tumefaciens was carried out by a method similar, but not
identical, to published methods (R. Horsch et al., 1988.
Plant Molecular Biology Manual, S. Gelvin et al., eds.,
Kluwer Academic Publishers, Boston). To provide source
tissue for the transformation, tobacco seeds (Nicotiana
tabacum cv. Kentucky 160) were surface sterilized and
planted on the surface of TOB-, which is a hormone-free
Murashige and Skoog medium (T. Murashige and F. Skoog,
1962. A revised medium for rapid growth and bioassays
with tobacco tissue cultures. Physiol. Plant. 15:473-497)
solidified with agar. Plants were grown for 6-8 weeks in
a lighted incubator room at 28-30°C, and leaves were
collected sterilely for use in the transformation
protocol. Pieces of approximately one cm^2 were sterilely
cut from these leaves, excluding the midrib. Cultures of
the Agrobacterium strain (EHA101S containing pDAB2041),
which had been grown overnight on a rotor at 28°C, were
pelleted in a centrifuge and resuspended in sterile
Murashige & Skoog salts, adjusted to a final optical
density of 0.7 at 600 nm. Leaf pieces were dipped in
this bacterial suspension for approximately 30 seconds,
then blotted dry on sterile paper towels, placed right
side up on medium TOB+ (Murashige and Skoog medium
containing 1 mg/L indole acetic acid and 2.5 mg/L
benzyladenine), and incubated in the dark at 28°C. Two
days later the leaf pieces were moved to medium TOB+
containing 250 mg/L cefotaxime (Agri-Bio, North Miami,
Florida) and 100 mg/L kanamycin sulfate (Agri-Bio) and
incubated at 28-30°C in the light. Leaf pieces were moved
to fresh TOB+ with cefotaxime and kanamycin twice per
week for the first two weeks and once per week
thereafter. Leaf pieces which showed regrowth of the
Agrobacterium strain were moved to medium TOB+ with
cefotaxime and kanamycin, plus 100 mg/L carbenicillin
(Sigma). Four to six weeks after the leaf pieces were
treated with the bacteria, small plants arising from
transformed foci were removed from this tissue
preparation and planted into medium TOB- containing 250
mg/L cefotaxime and 100 mg/L kanamycin in Magenta GA7
boxes (Magenta Corp., Chicago). These plantlets were
grown in a lighted incubator room. After 3-4 weeks the
primary transgenic plants had rooted and grown to a size
sufficient for leaf samples to be analyzed for
expression of protein from the transgene. Twenty-five
independent transgenic events were recovered as single
plants from the pDAB2041 transformation.

Eight independent lines expressing various levels of
transgenic protein from the T-DNA of pDAB2041 were
propagated in vitro from leaf pieces as follows. Twelve
to sixteen pieces of approximately one cm^2 were
sterilely cut from leaves of each primary transgenic
plant, excluding the midrib and all naturally occurring
edges. These leaf pieces were placed on medium TOB+
containing 250 mg/L cefotaxime and 100 mg/L kanamycin,
and cultured in the lighted incubator at 28-30°C for 3-4
weeks, at which time small plants could be cut from the
proliferating tissue mass. Several small plantlets from
each transgenic line were moved into Magenta boxes
containing medium TOB- plus cefotaxime and kanamycin and
allowed to root and grow. The proliferating tissue mass
was further cultured on medium TOB+ with cefotaxime and
kanamycin, and additional plants could be cut out and
grown up as needed.

Plants were moved into the greenhouse by washing the
agar from the roots, transplanting into soil in 5 inch
square pots, placing the pot into a Ziploc bag
(DowBrands), placing plain water into the bottom of the
bag, and placing in indirect light in a 30°C greenhouse
for one week. After one week the bag could be opened; the
plants were fertilized and allowed to grow further, until
the plants were acclimated and the bag was removed.
Plants were grown under ordinary warm greenhouse
conditions (30°C, 16 h light). Plants were suitable for
sampling four weeks post transplant.



Example 3 

Characterization Of Transgenic Tobacco Plants Expressing
Photorhabdus Toxin That Confer Insect Control.



A. Polyclonal Antibody Production 

The recombinant TcdA protein produced in E. coli was
purified by a series of column purification steps. The
protein was sent to Berkeley Antibody Company (Richmond,
CA) for the production of antiserum in a rabbit.
Inoculations with the antigen were initiated with 0.5 mg
of protein followed by four boosting injections of 0.25
mg each at about three-week intervals. The rabbit serum
was tested by standard Western analysis using the
recombinant TcdA protein as the antigen and the enhanced
chemiluminescence (ECL) method (Amersham, Arlington
Heights, IL). The antibodies (PAb-EA0) were purified
using a PURE I antibody purification kit (Sigma, St.
Louis, MO). PAb-EA0 antibodies recognize the full-length
TcdA and its processed components.

B. Expression Of TcdA Protein In Tobacco 

Protein was extracted from the leaf tissue of
transformed and non-transformed tobacco plants following
the procedure described immediately below.

Two leaf disks 1.4 cm in diameter were harvested
from the middle portion of a fully expanded leaf. The
disks were placed on a 1.6 x 4 cm piece of 3MM Whatman
paper. The paper was folded lengthwise and inserted in a
flexible straw. Four hundred microliters of the
extraction buffer (9.5 ml of 0.2 M NaH2PO4, 15.5 ml of
0.2 M Na2HPO4, 2 ml of 0.5 M Na2EDTA, 100 µl of Triton
X-100, 1 ml of 10% Sarkosyl, 78 µl of
beta-mercaptoethanol, H2O to bring total volume to 100
ml) was pipetted onto the paper. The straw containing the
sample was then passed through a rolling device used for
squeezing out the extract. A 1.5 mL microcentrifuge tube
was placed at the other end of the straw to collect the
extract. The extract was centrifuged for 10 minutes at
14,000 rpm in an Eppendorf refrigerated microcentrifuge.
The supernatant was transferred into a new tube. Protein
quantitation analysis was performed using the standard
Bio-Rad Protein Analysis protocol (Bio-Rad Laboratories,
Hercules, CA). The extract was diluted to 2 mg/ml of
total protein using the extraction buffer.
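
The buffer recipe and the dilution step above are both simple
C1V1 = C2V2 calculations. The sketch below works through them in
Python. The helper names are hypothetical, and the 5 mg/ml starting
concentration in the dilution example is an assumed value, since the
text does not report the protein concentration of the raw extracts.

```python
def final_concentration(stock_conc: float, stock_ml: float, total_ml: float) -> float:
    """C2 = C1 * V1 / V2 for a stock diluted into a larger final volume."""
    return stock_conc * stock_ml / total_ml


TOTAL_ML = 100.0  # final buffer volume stated in the recipe

# Both phosphate stocks are 0.2 M, so their volumes can be summed.
phosphate_molar = final_concentration(0.2, 9.5 + 15.5, TOTAL_ML)
edta_molar = final_concentration(0.5, 2.0, TOTAL_ML)
sarkosyl_percent = final_concentration(10.0, 1.0, TOTAL_ML)


def dilution_volumes(stock_mg_ml: float, target_mg_ml: float, final_ul: float):
    """Return (extract_ul, buffer_ul) needed to reach the target concentration."""
    if target_mg_ml > stock_mg_ml:
        raise ValueError("cannot dilute to a higher concentration")
    extract_ul = final_ul * target_mg_ml / stock_mg_ml
    return extract_ul, final_ul - extract_ul


# Hypothetical extract at 5 mg/ml, brought to the 2 mg/ml used in the text.
extract_ul, buffer_ul = dilution_volumes(5.0, 2.0, 100.0)

print(f"phosphate {phosphate_molar * 1000:.0f} mM, EDTA {edta_molar * 1000:.0f} mM, "
      f"Sarkosyl {sarkosyl_percent:.1f}%")
print(f"mix {extract_ul:.0f} µl extract with {buffer_ul:.0f} µl buffer")
```

Under these figures the buffer is about 50 mM phosphate, 10 mM EDTA,
and 0.1% Sarkosyl.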

For the detection of transgenic protein, Western
blot analysis was performed. Following a standard
procedure for protein separation (Laemmli, 1970), 40 µg
of protein was loaded in each well of a 4-20% gradient
polyacrylamide gel (Owl Scientific Co., MA) for
electrophoresis. Subsequently, the protein was
transferred onto a nitrocellulose membrane using a
semi-dry electroblotter (Pharmacia LKB Biotechnology,
Piscataway, NJ). The membrane was incubated for one hour
in Blotto (5% milk in TBST solution; 25 mM Tris-HCl pH
7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20). Thereafter,
Blotto was replaced by the primary antibody solution
(in Blotto). After one hour in the primary antibody, the
membrane was washed three times with TBST for five
minutes each. Then the secondary antibody in Blotto
(1:2000 dilution of goat anti-rabbit IgG conjugated to
horseradish peroxidase; Bio-Rad Laboratories) was added
to the membrane. After one hour of incubation, the
membrane was washed four times with an excess of TBST for
10 minutes each. The protein was visualized using the
enhanced chemiluminescence (ECL) method (Amersham,
Arlington Heights, IL). The differential intensity of
the protein bands was measured using a densitometer
(Molecular Dynamics Inc., Sunnyvale, CA).

To determine the expression of TcdA protein in
tobacco transformed with pDAB2041, PAb-EA0 antibodies were
used as the primary antibodies. The expression levels of
TcdA protein varied among independent transformation
events. The primary plant generated from event
#2041-13 showed the highest level of pre-pro TcdA
expression in extractable protein. When leaf pieces
from this plant (#2041-13) were used for in vitro
propagation, several plants were obtained. Seven of
these plants were analyzed for expression of the TcdA
protein. All but one plant produced the full-length TcdA
protein as well as some processed peptide components.
Using antibodies specific to neomycin
phosphotransferase, NPT (5 Prime-3 Prime, Boulder, CO),
expression of the selectable marker gene (npt II) was
detected. Similar results were obtained for #2041-29.



Table 5
Western analysis of plants derived from event #2041-13

Plant #      TcdA    NPT (selectable marker)
2041-13A     +       not done
2041-13B     +       not done
2041-13-1    -       +
2041-13-2    +       +
2041-13-3    +       +
2041-13-4    +       +
2041-13-5    +       +

C. Nucleic Acid Analysis of Transgenic Tobacco Lines 
Genomic DNA was prepared from a group of 2041
transgenic events. The lines included Magenta box stage
2041-13, and greenhouse stage plants 2041-13-1,
2041-13-2, 2041-13-5, 2041-9, 2041-20A and 2041-20B. A
transgenic GUS line (2023) was included as a negative
control. Southern analysis of these lines was performed.
The genomic tobacco DNA was restricted with the enzyme
SstI, which should result in an 8.9 kb hybridization
product when hybridized to a tcdA gene-specific probe.
The 8.9 kb hybridization product should consist of the
35T promoter and the tcdA coding region. All 2041 plants
contained a band of the expected size. Events 2041-9 and
2041-20 appear to be the same line, with 5 identical
hybridizing bands. Event 2041-13 produced 6
hybridization fragments with the tcdA coding region
probe. Magenta box and various greenhouse plants of
2041-13 all produced the same hybridization profile.
This hybridization pattern was different from that of
events 2041-9 and 2041-20.

RNA analysis, using the tcdA coding region probe,
was performed on the same group of greenhouse 2041
plants. Immunoblot analysis had revealed that plants
2041-9, 2041-20A, 2041-20B, and 2041-13-1 produced no
detectable TcdA protein, while 2041-13-2 and 2041-13-5
produced substantial amounts of full-length TcdA.
Northern analysis was in agreement with the immunoblot
result. Only a faint RNA signal was detected for plants
2041-9, 2041-20A, 2041-20B, and 2041-13-1, and a band
corresponding to full-length tcdA transcript was only
faintly visible in plant 2041-13-1. In contrast, for
plants 2041-13-2 and 2041-13-5 a strong RNA signal was
detected, with a substantial amount of full-length
(~8.0 kb) tcdA transcript. These data support the
observed bioassay activity for this group of plants.

Genomic DNA was prepared from a second functionally
active 2041 transgenic event, 2041-29. Southern analysis
of this line was performed. A transgenic GUS line (2023)
was included as a negative control, and DNA of line
2041-9 was included as a positive control.

The genomic tobacco DNAs were restricted with the
enzyme SstI, which should result in an 8.9 kb
hybridization product when hybridized to a tcdA
gene-specific probe. The 8.9 kb hybridization product
should consist of the 35T promoter and the tcdA coding
region. For plant 2041-29-5, three hybridization products
larger than 8.9 kb were detected with the tcdA
gene-specific probe. Because immunoblot analysis has
demonstrated that pre-pro TcdA protein is made by this
plant, it is likely that a restriction site was lost
during transformation or regeneration, or that the
2041-29 genomic DNA was not thoroughly digested.

D. Tobacco Leaf-Disk Tests With Tobacco Hornworm
Exhibiting Insect Control

Leaves were sampled from tobacco plants, Nicotiana
tabacum, previously transplanted into the greenhouse. A
single leaf was sampled from each plant on each test
date. Leaves were selected from the zone where younger
elongate leaves transition into older ovate leaves.
Excised leaves were placed into 12 oz. cups with the
petiole submerged in water to maintain turgor, and
transported to the laboratory.

Eight 1.4 cm disks were cut from the center portion
of one side of each leaf (the right side, viewed adaxial
side up with the distal portion facing away from the
observer). Each disk was placed individually into a well
of a C-D International 128 well tray (Pitman, NJ) into
which 0.5 ml of a 1.6% aqueous agar solution had been
previously pipetted. The solidified agar prevented the
leaf disks from drying out. The adaxial surface of the
disk was always oriented up.

A single neonate tobacco hornworm, Manduca sexta,
was placed on each disk and the wells were sealed with
vented plastic lids. The assay was held at 27°C and 40%
RH. Larval mortality and live-weight data were collected
after 3 days. Data were subjected to analysis of
variance and Duncan's multiple range test (α = 0.05) (Proc
GLM, SAS Institute Inc., Cary, NC). Data were
transformed using a logarithmic function to correct a
correlation between the magnitude of the mean and
variance.
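
The analysis described above (log transformation followed by
analysis of variance) can be illustrated with a short standard-library
Python sketch. This is not the SAS Proc GLM procedure used in the
text, and it does not implement Duncan's multiple range test; it only
computes the one-way ANOVA F statistic on log-transformed weights.
The larval weights shown are hypothetical values chosen for
illustration, not data from the tables.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical surviving-larva weights (mg); dead larvae are omitted.
control = [18.1, 16.4, 19.0, 15.2, 17.7, 16.9, 18.5, 17.3]
treated = [6.1, 7.4, 5.2, 6.8, 5.9, 7.0]

# Log transform to decouple the variance from the magnitude of the mean.
log_groups = [[math.log10(w) for w in g] for g in (control, treated)]
f_stat, df_b, df_w = one_way_anova_f(log_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

The F statistic would then be compared with the critical value of the
F distribution for the stated degrees of freedom.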

Table 6

Results of leaf-disk assays from greenhouse grown tobacco
plants with event 2041-13.

Weight of Surviving Larvae (mg) & Duncan's Group

TRT  Plant                   Plant Age  Pretest  Test 1      Test 2    Test 3     3 Test Sum
13   non-transformed - 2     young                                     18.8 a*
14   non-transformed - 3     young                                     17.0 ab
16   non-transformed - 5     young                                     16.4 ab
3    2041-13-1 (western -)   young               17.6 a      18.2 a    16.1 ab    17.3 a
9    GUS control             old        19.3 a   14.6 a      16.3 a    14.5 ab    15.1 a
10   non-transformed - 1     young               8.3 b       16.8 a    13.9 b     13.0 b
11   2041-20B (western -)    old                 10.0 b*     13.7 ab   14.6 ab    12.9 b
15   non-transformed - 4     young                                     13.0 bc
8    2041-20A (western -)    old        15.7 a   8.3 b       10 bc     9.2 cd     9.6 c
12   2041-9 (western -)      old        19.5 a                         7.9 d
7    2041-13-5 (western +)   young               6.3 bc      9.6 cd    7.2 de     7.7 d
5    2041-13-3 (western +)   young               6.4 bc****  6.2 e     6.8 de**   6.4 de
1    2041-13A (western +)    old        7.2 b    6.8 bc*     7.0 de*   5.4 e      6.4 de
6    2041-13-4 (western +)   young               4.9 c****   5.8 e     7.6 d      6.4 de
4    2041-13-2 (western +)   young               5.7 bc      5.7 e**   7.5 d      6.3 de
2    2041-13B (western +)    old                 4.7 c**     5.6 e     7.2 de     5.9 e

* Number of stars corresponds to the number of dead
larvae per 8 tested.
1. Data transformed (logarithm) for analysis.
Means followed by the same letter are not significantly
different (alpha = 0.05).

TABLE 7

Results Of Leaf-Disk Assays From Greenhouse Grown Tobacco
Plants With Event 2041-29.

MEAN WGT (MG) / Duncan's Group

Plant          Test 1    Test 2     Test 3       Test 4      Four Test Summary
2014-6 GUS 1   15.8 a    16.6 a     **5.5 bc     *12.9 ab    13.2 a
2014-6 GUS 2   14.4 a    *6.6 bc    *13.4 a      15.2 a      12.6 a
KY-160 NTC     13.4 a    6.7 bc     7.9 b        8.5 bc      9.1 b
2041-29 4P     *4.9 b    *7.3 b     ****6.9 b    ********    6.3 c
2041-29 7      *5.9 b    5.1 bc     ***6.7 b     ***7.2 c    6.1 c
2041-29 3P     *5.6 b    **7.9 b    *****6.5 b   ***3.6 d    5.9 c
2041-29 2P     6.3 b     ****4.7 c  ******4.1 c  ******4.6 d 5.4 c

* Number of stars corresponds to the number of dead
larvae per 8 tested.
1. Data transformed (logarithm) for analysis.
Means followed by the same letter are not significantly
different (alpha = 0.05).

All event 2041-29 plants significantly depressed THW
larval weight gain compared to control plants. Average
weight depression was 49%. Statistically significant
mortality occurred in THW larvae exposed to foliage from
2041-29 plants. Mortality averaged 37.5% compared to
5.2% in controls.
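
The summary figures quoted above can be reproduced from Table 7. The
sketch below is one way to do so in Python. It assumes that the 49%
figure is the unweighted mean of the four-test summary weights of the
2041-29 plants relative to that of the three control plants, and that
every star in the table, including marks that printed imperfectly,
stands for one dead larva out of eight. Neither assumption is stated
in the text, although both reproduce the reported numbers.

```python
LARVAE_PER_CELL = 8  # larvae tested per plant per test

# Four-test summary weights (mg) read from Table 7.
control_mean_mg = [13.2, 12.6, 9.1]       # GUS 1, GUS 2, KY-160 NTC
event_mean_mg = [6.3, 6.1, 5.9, 5.4]      # 2041-29 4P, 7, 3P, 2P

# Dead larvae per test (Tests 1-4), counted from the stars in Table 7.
control_dead = [[0, 0, 2, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
event_dead = [[1, 1, 4, 8], [1, 0, 3, 3], [1, 2, 5, 3], [0, 4, 6, 6]]


def mean(values):
    return sum(values) / len(values)


def percent_mortality(dead_counts):
    """Dead larvae as a percentage of all larvae exposed."""
    dead = sum(sum(row) for row in dead_counts)
    exposed = LARVAE_PER_CELL * sum(len(row) for row in dead_counts)
    return 100.0 * dead / exposed


weight_depression_pct = 100.0 * (1.0 - mean(event_mean_mg) / mean(control_mean_mg))
event_mortality_pct = percent_mortality(event_dead)
control_mortality_pct = percent_mortality(control_dead)

print(f"weight depression: {weight_depression_pct:.0f}%")
print(f"mortality: {event_mortality_pct:.1f}% vs {control_mortality_pct:.1f}% in controls")
```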


E. Isolation and Characterization of Functional 
Photorhabdus Toxin Protein From Transgenic Plants 

Seven grams of transgenic tobacco plants (2041-13)
expressing the tcdA (Toxin A) gene were homogenized with
10 ml of 50 mM potassium phosphate buffer, pH 7.0, using a
bead beater (Biospec Products, Bartlesville, OK)
according to the manufacturer's instructions. The
homogenate was filtered through four layers of cheese
cloth and then centrifuged at 35,000 g for 15 min. The
supernatant was collected and filtered through a 0.22 µm
Millipore Express™ membrane. It was then applied to a
Superdex 200 column (2.6 x 40 cm) which had been
equilibrated with 20 mM Tris buffer, pH 8.0 (Buffer A).
The protein was eluted in Buffer A at a flow rate of 3
ml/min. Fractions of 3 ml each were collected and
subjected to southern corn rootworm (SCR) bioassay. It
was found that fractions corresponding to a native
molecular weight around 860 kDa had the highest
insecticidal activity. Western analysis of the active
fraction using a polyclonal antibody specific to Toxin A
indicated the presence of full-length TcdA peptide. The
active fractions were then combined and applied to a
Mono Q 10/10 column which had been equilibrated with
Buffer A. Proteins bound to the column were then eluted
by a linear gradient of 0 to 1 M NaCl in Buffer A.
Fractions of 2 ml each were collected and analyzed by
both SCR bioassay and Western analysis using antibody
specific to Toxin A. The results again demonstrated the
correlation between insecticidal activity and presence
of full-length TcdA peptide.
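
For orientation, the gel-filtration parameters above fix the
approximate bed volume and the timing of the fractions. The sketch
below assumes that "2.6 x 40 cm" gives the column's inner diameter and
bed height, which is the usual convention but is not stated in the
text; the function name is hypothetical.

```python
import math


def bed_volume_ml(inner_diameter_cm: float, bed_height_cm: float) -> float:
    """Volume of a cylindrical column bed (1 cm^3 = 1 ml)."""
    radius_cm = inner_diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * bed_height_cm


FLOW_ML_PER_MIN = 3.0   # flow rate stated in the text
FRACTION_ML = 3.0       # fraction size stated in the text

column_ml = bed_volume_ml(2.6, 40.0)
minutes_per_fraction = FRACTION_ML / FLOW_ML_PER_MIN
minutes_per_column_volume = column_ml / FLOW_ML_PER_MIN

print(f"bed volume: about {column_ml:.0f} ml")
print(f"one fraction every {minutes_per_fraction:.0f} min; "
      f"one column volume in about {minutes_per_column_volume:.0f} min")
```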

F. Characterization of Progeny Transgenic Plants

The heritability of the Photorhabdus toxin gene in
the genetically engineered plants was evaluated by
generating F1 progeny. Progeny were generated from the
2041-13 event by selfing expression-positive plants. The
2041-13 plants in the greenhouse were allowed to
self-pollinate. Seed capsules were collected when mature
and were allowed to dry and after-ripen on the
laboratory bench for two weeks. Seed from the plant
designated 2041-13A was surface-sterilized and
distributed on the surface of medium TOB- without
selection, to allow recovery of nonexpressing or
nontransgenic progeny as well as expressing and
segregating transgenic siblings. Seed was germinated in
a lighted incubator room (16 h light, 28°C). After 1
month, fifty-one seedlings, designated 2041-13A-S1
through S51, were distributed into Magenta boxes
containing medium TOB- to grow further. Three weeks
later, leaf samples from these Magenta-box grown
seedlings were submitted for evaluation of the level of
expression of TcdA toxin.

Leaf samples were tested for kanamycin response by
placing sterile leaf segments on medium TOB+ containing
100 mg/L kanamycin in the light and scoring for tissue
growth and color after two weeks. All leaf pieces showed
some positive response, indicating complex segregation.

This group of in vitro grown event 2041-13 progeny
seedlings were all transplanted into the greenhouse
approximately two months after seeding onto medium, using
the following method. After washing the agar from the
roots, plants were transplanted into 5 inch square pots
in a soil mix containing 75% MetroMix and 25% mineral
soil. They were enclosed in a zip-lock bag, and plain
water was added to leave 1-2 inches of water in the
bottom of the bag after soil absorption. These bags were
closed and placed under a cart in the greenhouse to
protect them from direct sunlight. The bags were opened
after 5-6 days and removed after 7 days, when the plants
were adapted to soil and were moved to the top of the
cart for normal greenhouse culture. Plants were ready to
test in insect bioassays at four weeks post transplant.

F1 progeny were evaluated for expression of protein
toxin by immunological screen and for biological activity
by plant bioassays, as described previously, using
tobacco hornworm. There was a positive correlation
between the level of expression of protein toxin and the
degree of growth inhibition, and at higher expression
levels mortality was observed. The difference in
biological activity between the non-transformed
population and the transformed population expressing
protein toxin was statistically significant at high
confidence levels.

The following table summarizes the results of insect
(tobacco hornworm) bioassays conducted with F1 progeny of
self-fertilized 2041-13 plants genetically engineered to
produce Toxin A. The tests included 6 non-expressing
progeny (protein-negative controls), 45 Toxin A
expressors, and 4 non-transformed controls (KY-160).
Results are from three leaf-disk assays (method
previously outlined) where eight disks were used per
test. The data were analyzed using analysis of variance
and were blocked by test.

The treatment effect for each of these analyses
indicated that Pr > F was less than 0.0001. The Toxin A
expressors produced significant control of tobacco
hornworm compared to each of the control groups based on
each of the three measures of efficacy. The two control
groups behaved similarly. Statistical analysis using
ANOVA and an LSD test with alpha equal to 0.01 (or 1%)
showed differences between the 3 groups. The LSD test
indicated that the non-expressors and the non-transformed
plants were similar in larval weights, but the expressors
gave weights significantly lower than either of the other
two groups of plants. These data demonstrated that the
genetic basis for insect control was heritable and
corresponded to the presence of the expressed toxin gene.
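
The progeny counts given above (45 expressors and 6 non-expressors
among 51 seedlings) can be compared with simple Mendelian ratios as
an illustration. The text reports no such test, so the goodness-of-fit
sketch below is an assumption-laden aside: it treats protein
expression as a proxy for transgene presence, and the small expected
count in the 15:1 case makes the chi-square approximation rough.

```python
def chi_square(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts from the text: expressors, non-expressors.
observed = (45, 6)

# Standard critical value for chi-square with 1 degree of freedom, alpha = 0.05.
CRITICAL_DF1_ALPHA05 = 3.841

chi_one_locus = chi_square(observed, (3, 1))     # single dominant insertion locus
chi_two_loci = chi_square(observed, (15, 1))     # two unlinked insertion loci

for label, value in (("3:1", chi_one_locus), ("15:1", chi_two_loci)):
    verdict = "rejected" if value > CRITICAL_DF1_ALPHA05 else "not rejected"
    print(f"{label}: chi-square = {value:.2f} ({verdict} at alpha = 0.05)")
```

On these counts a single-locus 3:1 ratio is rejected while a two-locus
15:1 ratio is not, which is at least compatible with the multiple
hybridizing bands seen for event 2041-13.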

Table 8

Tobacco hornworm results from F1 progeny of self-fertilized
2041-13 tobacco plants.

                           Mean Value and Duncan's Grouping (d)
Treatment Group            Total Weight (mg) (a)   Survivor Weight (mg) (b)   Leaf Area (cm^2) (c)
Non-transformed Control    15.8 a                  15.8 a                     1.1 a
Protein-negative Control   16.4 a                  16.5 a                     1.2 a
Toxin A Expressor          8.1 b                   9.2 b                      4.9 b

a Average insect weight with dead insects considered to
weigh nothing.
b Average insect weight with dead insects excluded from
analysis.
c Total leaf area remaining per eight leaf disks. Initial
area was approximately 12 cm^2.
d Means followed by the same letter are not significantly
different (alpha = 0.05).

Example 4

Transformation Of Maize With a Vector Carrying Plasmid
pDAB1834 Encoding Photorhabdus Toxins

A. Preparation Of Maize Transformation Vectors
Containing Modified Plant-Optimized tcdA Coding Regions:
Plasmid pDAB1834

Preparation of maize transformation vectors was
accomplished in two steps. First, a modified
plant-optimized tcdA coding region was ligated into a
plant expression cassette plasmid. In this step, the
coding region was placed under the transcriptional
control of a promoter functional in maize plant cells.
RNA transcription termination and polyadenylation were
mediated by a downstream copy of the terminator region
from the Agrobacterium nopaline synthase gene. One
plasmid designed to function in this role is pDAB1538. In
the second step, the complete gene, comprising the
promoter, coding region, and 3' UTR terminator region, was
ligated to a plant transformation vector that contained a
plant-expressible selectable marker gene which allowed
the selection of transformed maize plant cells amongst a
background of nontransformed cells. An example of such a
vector is pDAB367.

It is a feature of plasmid pDAB1538 that any coding
region having an NcoI site at its 5' end and a SacI site
3' to the coding region, when cloned into the unique NcoI
and SacI sites of pDAB1538, is placed under the
transcriptional control of the maize ubiquitin1 (ubi1)
promoter. It is also a feature of pDAB1538 that the 5'
untranslated leader (UTR) sequence preceding the NcoI
site comprises a polylinker. Additionally, it is a
feature of pDAB1538 that transcription termination and
polyadenylation of the mRNA containing the introduced
coding region are mediated by termination/poly A addition
sequences derived from the nopaline synthase (Nos) gene.
Finally, it is a feature of pDAB1538 that the entire
assembly of promoter/coding region/3' UTR can be obtained
as a single DNA fragment by cleavage at the flanking NotI
sites.

It is a feature of pDAB367 that the phosphinothricin
acetyl transferase protein, which has as its substrate
phosphinothricin and related compounds, is produced in
plant cells through transcription of its coding region
mediated by the Cauliflower Mosaic Virus 35S promoter and
that termination of transcription plus polyadenylation
are mediated by the nopaline synthase terminator region.
It is further a feature of pDAB367 that any DNA fragment
containing flanking NotI sites can be cloned into the
unique NotI site of pDAB367, thus physically linking the
introduced DNA fragment to the aforementioned selectable
marker gene.

To prepare a maize plant-expressible gene to produce
the endoplasmic reticulum-targeted TcdA protein in plant
cells, DNA of a plasmid (pA0H_4-ER) containing the
plant-optimized, ER-targeted tcdA coding region (SEQ ID
NO: 6) was cleaved with restriction enzymes NcoI and
SacI, and the large 7610 bp fragment was ligated to
similarly cut DNA of plasmid pDAB1538 to produce plasmid
pDAB1832. DNA of pDAB1832 was then digested with NotI,
and the 9984 bp NotI fragment was ligated into the unique
NotI site of pDAB367 to produce plasmid pDAB1834.

It is a feature of plasmid pDAB1834 that the ubi1
and 35S promoters are encoded on the same DNA strand.


B. Transformation and Regeneration of Transgenic Maize 
Isolates 

Type II callus cultures were initiated from immature
zygotic embryos of the genotype "Hi-II" (Armstrong et
al., (1991) Maize Genet. Coop. Newslett., 65: 92-93).
Embryos were isolated from greenhouse-grown ears from
crosses between Hi-II parent A and Hi-II parent B, or F2
embryos derived from a self- or sib-pollination of a
Hi-II plant. Immature embryos (1.5 to 3.5 mm) were
cultured on initiation medium consisting of N6 salts and
vitamins (Chu et al., (1978) The N6 medium and its
application to anther culture of cereal crops. Proc.
Symp. Plant Tissue Culture, Peking Press, 43-56), 1.0
mg/L 2,4-D, 25 mM L-proline, 100 mg/L casein hydrolysate,
10 mg/L AgNO3, 2.5 g/L GELRITE (Schweizerhall, South
Plainfield, NJ), and 20 g/L sucrose, with a pH of 5.8.
After four to six weeks callus was subcultured onto
maintenance medium (initiation medium in which AgNO3 was
omitted and L-proline was reduced to 6 mM). Selection for
Type II callus took place for ca. 12-16 weeks.

Plasmid pDAB1834 was transformed into embryogenic
callus. For blasting, 140 µg of plasmid DNA was
precipitated onto 60 mg of alcohol-rinsed, spherical gold
particles (1.5 - 3.0 µm diameter, Aldrich Chemical Co.,
Inc., Milwaukee, WI) by adding 74 µL of 2.5 M CaCl2 H2O and
30 µL of 0.1 M spermidine (free base) to 300 µL of plasmid
DNA and H2O. The solution was immediately vortexed and
the DNA-coated gold particles were allowed to settle.
The resulting clear supernatant was removed and the gold
particles were resuspended in 1 ml of absolute ethanol.
This suspension was diluted with absolute ethanol to
obtain 15 mg DNA-coated gold/mL.

Approximately 600 mg of embryogenic callus tissue
was spread over the surface of Type II callus maintenance
medium as described herein lacking casein hydrolysate and
L-proline, but supplemented with 0.2 M sorbitol and 0.2 M
mannitol as an osmoticum. Following a 4 h pre-treatment,
tissue was transferred to culture dishes containing
blasting medium (osmotic media solidified with 20 g/L TC
agar (PhytoTechnology Laboratories, LLC, Shawnee Mission,
KS) instead of 7 g/L GELRITE). Helium blasting
accelerated suspended DNA-coated gold particles towards
and into the prepared tissue targets. The device used
was an earlier prototype of that described in US Patent
5,141,131, which is incorporated herein by reference.
Tissues were covered with a stainless steel screen (104
µm openings) and placed under a partial vacuum of 25
inches of Hg in the device chamber. The DNA-coated gold
particles were further diluted 1:1 with absolute ethanol
prior to blasting and were accelerated at the callus
targets four times using a helium pressure of 1500 psi,
with each blast delivering 20 µL of the DNA/gold
suspension. Immediately post-blasting, the tissue was
transferred to osmotic media for a 16-24 h recovery
period. Afterwards, the tissue was divided into small
pieces and transferred to selection medium (maintenance
medium lacking casein hydrolysate and L-proline but
containing 30 mg/L BASTA® (AgrEvo, Berlin, Germany)).
Every four weeks for 3 months, tissue pieces were
non-selectively transferred to fresh selection medium. After
7 weeks and up to 22 weeks, callus sectors found
proliferating against a background of growth-inhibited
tissue were removed and isolated. The resulting
BASTA®-resistant tissue was subcultured biweekly onto fresh
selection medium. Following western analysis, positive
transgenic lines were identified and transferred to
regeneration media. Western-negative lines underwent
subsequent RNA spot blot analysis to identify negative
controls for regeneration.
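
The gold and DNA loads implied by the blasting protocol can be worked out with a short calculation. The sketch below is illustrative only: it takes the plasmid DNA mass as 140 µg, reads the 1:1 dilution as equal volumes of suspension and ethanol, and assumes that all of the DNA is bound to the gold and that the suspension is uniform. The function and key names are hypothetical.

```python
def blast_loading(gold_mg=60.0, dna_ug=140.0, stock_mg_per_ml=15.0,
                  blast_volume_ul=20.0, blasts_per_target=4):
    """Estimate gold and DNA delivered per blast and per callus target."""
    # Final suspension volume needed to reach 15 mg DNA-coated gold per mL.
    stock_volume_ml = gold_mg / stock_mg_per_ml
    # Assumption: "diluted 1:1" means equal volumes, halving the concentration.
    working_mg_per_ml = stock_mg_per_ml / 2.0
    # Gold contained in one blast (convert uL to mL).
    gold_per_blast_mg = working_mg_per_ml * blast_volume_ul / 1000.0
    # Assumption: every microgram of DNA ended up on the gold particles.
    dna_per_blast_ug = gold_per_blast_mg * dna_ug / gold_mg
    return {
        "stock_volume_ml": stock_volume_ml,
        "working_mg_per_ml": working_mg_per_ml,
        "gold_per_blast_mg": gold_per_blast_mg,
        "dna_per_blast_ug": dna_per_blast_ug,
        "gold_per_target_mg": gold_per_blast_mg * blasts_per_target,
        "dna_per_target_ug": dna_per_blast_ug * blasts_per_target,
    }

loading = blast_loading()
for name, value in loading.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions each blast carries roughly 0.15 mg of gold, and the four blasts together deliver well under 1 % of the 140 µg of DNA prepared.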

Regeneration was initiated by transferring callus
tissue to cytokinin-based induction medium, which
consisted of Murashige and Skoog salts, hereinafter MS
salts, and vitamins (Murashige and Skoog, (1962) Physiol.
Plant. 15: 473-497), 30 g/L sucrose, 100 mg/L
myo-inositol, 30 g/L mannitol, 5 mg/L 6-benzylaminopurine,
hereinafter BAP, 0.025 mg/L 2,4-D, 30 mg/L BASTA®, and
2.5 g/L GELRITE at pH 5.7. The cultures were placed in
low light (125 ft-candles) for one week followed by one
week in high light (325 ft-candles). Following a
two-week induction period, tissue was non-selectively
transferred to hormone-free regeneration medium, which
was identical to the induction medium except that it
lacked 2,4-D and BAP, and was kept in high light. Small
(1.5-3 cm) plantlets were removed and placed in 150x25 mm
culture tubes containing SH medium (SH salts and vitamins
(Schenk and Hildebrandt, (1972) Can. J. Bot. 50:199-204),
10 g/L sucrose, 100 mg/L myo-inositol, 5 mL/L FeEDTA, and
2.5 g/L GELRITE, pH 5.8). Plantlets were transferred to
12 cm pots containing approximately 0.25 kg of METRO-MIX
360 (The Scotts Co., Marysville, OH) in the greenhouse as
soon as they exhibited growth and developed a sufficient
root system. They were grown with a 16 h photoperiod
supplemented by a combination of high pressure sodium and
metal halide lamps, and were watered as needed with a
combination of three independent Peters Excel fertilizer
formulations (Grace-Sierra Horticultural Products
Company, Milpitas, CA). At the 6-8 leaf stage, plants
were transplanted to five gallon pots containing
approximately 4 kg METRO-MIX 360, and grown to maturity.

EXAMPLE 5

Characterization Of Transgenic Maize Plants
Expressing Photorhabdus Toxin That Confer Insect Control.
A. Insect Bioassays

A single leaf was sampled from each plant in each
test. Eight 1.4 cm disks were cut from the outer portion
of each leaf (approximately 30 cm long), avoiding the
center vein. Each disk was placed individually into a
well of a C-D International 128 well tray (Pitman, NJ)
into which 0.5 ml of a 1.6% aqueous agar solution had
been previously pipetted. The solidified agar prevented
the leaf disks from drying out. The adaxial surface of
the disk was always oriented up.

Five neonate southern corn rootworms, Diabrotica
undecimpunctata howardi, were placed on each disk and the
wells were sealed with vented plastic lids. The assay
was held at 27°C and 40% RH. Larval mortality and
live-weight data were collected after 3 days. Data were
subjected to analysis of variance and Duncan's multiple
range test (α = 0.05) (Proc GLM, SAS Institute Inc., Cary,
NC). Weight data were transformed using a logarithmic
function to correct a correlation between the magnitude
of the mean and variance.

TABLE 9

Results of Maize Leaf-disk Test vs SCR

Treatment        Mean % Kill    Mean Survival Weight (mg)
                 (Duncan's)     (Duncan's)
1834-11          68 A           0.064 A
1834-17          44 B           0.098 B
1834-15          26 BC          0.127 C
Hi-II control    13 C           0.161 C

Note: Means followed by the same letter are not
significantly different based on Duncan's multiple range
test (alpha=0.05). Insect groups weighing less than 0.1
mg were set to 0.03 mg instead of zero to conduct a more
conservative analysis. Mortality (arcsin(sqrt)) and
weight (log10) data were transformed for analyses.
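
The two transformations named in the note to Table 9 can be sketched as follows. This is an illustrative reading rather than the SAS procedure actually used: it assumes percent kill is converted to a proportion before the arcsine square-root transform, and that the 0.03 mg substitution is applied to any group weight below 0.1 mg before taking log10. The function names are hypothetical.

```python
import math

def arcsine_sqrt(percent_kill):
    """Arcsine square-root transform of a percentage (result in radians)."""
    proportion = percent_kill / 100.0
    return math.asin(math.sqrt(proportion))

def log10_weight(group_weight_mg, threshold_mg=0.1, substitute_mg=0.03):
    """log10 transform of a group weight, substituting 0.03 mg below 0.1 mg."""
    # Groups too light to weigh reliably (including zero) get the fixed
    # substitute value, which also avoids taking the logarithm of zero.
    if group_weight_mg < threshold_mg:
        group_weight_mg = substitute_mg
    return math.log10(group_weight_mg)

if __name__ == "__main__":
    for kill in (0, 50, 100):
        print(kill, round(arcsine_sqrt(kill), 4))
    for weight in (0.0, 0.05, 0.5, 1.0):
        print(weight, round(log10_weight(weight), 4))
```

Both transforms are standard variance-stabilizing choices: the arcsine square-root for proportions and the logarithm for measurements whose variance grows with the mean.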

The results shown in Table 9 demonstrated that two events
expressing TcdA protein were statistically distinct from
control lines bioassayed using SCR neonates by mortality and
survival weight criteria. These results demonstrated that
southern corn rootworm were functionally affected by feeding
on maize plants containing and expressing the tcdA gene.
Those plants from 1834-11 were used to generate progeny for
testing of inheritability of the transgene.

B. PRODUCTION AND PROGENY TEST OF tcdA TRANSGENIC MAIZE

Origin and growth of progeny plants: Sibling plants
1834-11-07 and 1834-11-08, clonally derived by regeneration
from the callus of transgenic maize event 1834-11, were
transplanted to the greenhouse and pollinated with inbred
OQ414. Seeds obtained from these crosses, comprising seed
lots 1834-11-07A and 1834-11-08A, were planted in
Rootrainers (1 % inch x 2 inch x 8 inch deep, product
#647, C. Hummert Intl., Earth City, Mo.) filled with
Metro-Mix 360 soilless mix (Scotts Terra-Lite, available
from Hummert Intl.) and top irrigated with Hoagland's
nutrient solution. (Hoagland's solution contains 229 ppm
nitrogen as nitrate, 24.6 ppm nitrogen as ammonium, 26
ppm P, 157 ppm K, 187 ppm Ca, 49 ppm Mg, and 30 ppm Na.)

Greenhouse conditions for this trial were: 16 hour
days, daylight supplemented by metal halide lamps as
needed to achieve a minimum of 600 µEinsteins/cm² PAR, and
ambient temperature 30 °C days, 22 °C nights.

Leaves were sampled for protein determination
approximately one week after planting. Leaf bioassays
were conducted 2-3 weeks after planting; root bioassays
were initiated approximately 3 weeks post planting.

Protein analysis of progeny plants: Protein was extracted
from leaf and root samples harvested from transgenic
plants, line 1834-11 progenies, and non-transformed
plants. Each sample was placed on a 1.6 x 4 cm piece of
3M Whatman™ paper. The paper was folded lengthwise and
inserted in a flexible straw. A volume of 350 µl of an
extraction buffer (9.5 ml of 0.2 M NaH2PO4, 15.5 ml of 0.2
M Na2HPO4, 2 ml of 0.5 M Na2EDTA, 100 µl of Triton X-100,
1 ml of 10% Sarkosyl, 78 µl of beta-mercaptoethanol, H2O
to bring total volume to 100 ml, 50 ng/ml Antipain, 50
µg/ml Leupeptin, 0.1 mM Chymostatin, 5 µg/ml Pepstatin)
was pipetted on to the paper. The straw containing the
sample was then passed through a rolling device used for
squeezing the extract into a 1.5 ml microcentrifuge tube.
The extract was centrifuged for 10 minutes at 14,000 rpm
in an Eppendorf refrigerated micro-centrifuge. The
supernatant was transferred into a new tube. The amount
of the total extractable protein was determined using a
standard BioRad Protein Analysis protocol (BioRad
Laboratories, Hercules, CA).

The presence of the TcdA protein was visualized by
Western blot analysis following a standard procedure for
protein separation (Laemmli, 1970). A volume of twenty
µl of extract was loaded in each well of a 4-20% gradient
polyacrylamide gel (Owl Scientific Co., MA) for
electrophoresis. Subsequently, the protein was
transferred onto a nitrocellulose membrane using a
semi-dry electroblotter (Pharmacia LKB Biotechnology,
Piscataway, NJ). The membrane was incubated for one hour
in TBST-M solution (10% milk in TBST solution; 25 mM Tris
HCl pH 7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20).
Thereafter, the primary antibody (Anti-TcdA in TBST-M)
was added. After one hour, the membrane was washed with
TBST for five minutes, three times. Then the secondary
antibody solution (goat anti-rabbit IgG conjugated to
horseradish peroxidase; Bio-Rad Laboratories, in TBST-M)
was added to the membrane. After one hour of incubation,
the membrane was washed with an excess amount of TBST for
10 minutes, four times. The protein was visualized using
the Super Signal® West Pico chemiluminescence method
(Pierce Chemical Co., Rockford, IL). The protein blot
was exposed on a Hyper-film (Amersham, Arlington Heights,
IL) and was developed within 3 minutes. The intensity of
the protein band was measured using a densitometer
(Molecular Dynamics Inc., Sunnyvale, CA) and compared to
standards.

Three of six plants from seed lot 1834-11-07A and
three of six plants from seed lot 1834-11-08A produced
detectable levels of TcdA protein (Table 10).
Approximately 3.8 to 13.3 ppm of TcdA were detected in
the leaf blades and 4.1 to 8.4 ppm were detected in the
leaf tips of the protein-positive plants. The amounts of
TcdA protein detected in the roots were slightly lower
than those found in the leaves.

Insect bioassays with progeny plants: Plants were
selected for bioassay based on results from Western blot
analysis. Twelve (12) 6.4 mm diameter leaf discs were
cut from the youngest leaf of each 2 week old seedling.
Each disc was placed in a well of a 128-well tray (CD
International) containing approximately 0.5 mL of a
solidified 2% agar in water solution. Two neonate
southern corn rootworm, Diabrotica undecimpunctata
howardi (Barber) (SCR), were placed in each well with a
leaf disc. Trays were covered with perforated lids and
maintained under a controlled environment for 3 days (28
°C; 16 hours light: 8 hours dark; approx. 60% relative
humidity). Living larvae from 4 leaf discs were pooled
and weighed, producing 3 weight determinations per plant.
Average weights were calculated by dividing the pooled
weight by the number of survivors. Differences in
average weights of SCR fed leaf discs from protein
positive and protein negative plants were assessed using
analysis of variance on the natural log-transformed
average weights (Minitab, v. 12.2, Minitab Inc., State
College, PA).
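
The weight calculation described above (pool the survivors from 4 discs, divide by the number of survivors, then take the natural log) can be sketched as follows. The pooled weights and survivor counts in the example are hypothetical values chosen for illustration, not data from this study, and the function names are likewise hypothetical.

```python
import math

def average_survivor_weight(pooled_weight_mg, survivors):
    """Pooled live weight divided by the number of surviving larvae."""
    if survivors <= 0:
        return None  # no survivors, so no average weight can be computed
    return pooled_weight_mg / survivors

def plant_determinations(pools):
    """pools: (pooled_weight_mg, survivors) pairs, one per group of 4 discs.

    Returns (average weight, natural log of average weight) for each pool
    that has at least one survivor.
    """
    results = []
    for pooled_weight_mg, survivors in pools:
        avg = average_survivor_weight(pooled_weight_mg, survivors)
        if avg is not None:
            results.append((avg, math.log(avg)))
    return results

# Hypothetical plant: 12 discs give 3 pools of 4 discs; with 2 larvae per
# disc, each pool can hold at most 8 survivors.
example_pools = [(1.20, 8), (0.90, 6), (0.70, 5)]
determinations = plant_determinations(example_pools)
for avg, ln_avg in determinations:
    print(f"avg = {avg:.3f} mg, ln(avg) = {ln_avg:.3f}")
```

The log-transformed values, not the raw averages, are what would then be passed to the analysis of variance.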

Root bioassays were initiated approximately 1 week
after the initiation of the leaf disc bioassays.
Approximately 24 h prior to eclosion, SCR eggs were
suspended in a 0.15% solution of agar in water to a
concentration of 100 eggs/ml. Plants were inoculated
with SCR eggs by pipetting 2.0 ml of the egg suspension
(i.e., approximately 200 eggs) just below the soil surface
at the base of each plant. Two weeks after inoculation,
plants were removed from their Rootrainer pots, their
roots washed free of potting mix, and scored for rootworm
damage based on a 1 (resistant) to 9 (susceptible) rating
system (Welch, 1977). The results of the root ratings
were examined using non-parametric tests to determine if
the distribution of root ratings from the protein
positive plants was the same as the distribution of the
ratings from the protein negative plants. Testing was
done at the 5% significance level (StatXact v. 3, CYTEL
Software Corporation, Cambridge, MA).

Results from leaf and root bioassays of tcdA protein
positive and protein negative progeny plants are
summarized in Table 10. The average weights of SCR
larvae fed leaf discs from protein positive plants were
significantly lower than those of larvae fed leaf discs
from protein negative plants (F = 4.6; d.f. = 1, 34; P <
0.001). The Kolmogorov-Smirnov 2 sample test (p=0.04) and
the Wald-Wolfowitz runs test (p=0.001) indicated that
the protein positive and protein negative root rating
distributions were not similar. The Wilcoxon-Mann-Whitney
test (p=0.0206) and the Normal Scores test
(p=0.206) indicated that the average score for the
protein positive plants was lower than the average root
rating from the protein negative plants.

Table 10. Protein analysis and insect bioassay results
with progeny of TcdA transgenic maize.

Plant Number      TcdA Protein   Leaf Disc Bioassay   Root Bioassay
                                 Avg. Wt. (mg)        Root Rating (1-9)
1834-11-07A-30    PRO-           0.190                8
1834-11-08A-21    PRO-           0.196                9
1834-11-08A-16    PRO-           0.195                9
1834-11-08A-14    PRO-           0.137                9
1834-11-07A-22    PRO-           0.208                9
1834-11-07A-20    PRO-           0.175                9
1834-11-07A-26    PRO+           0.118                9
1834-11-08A-17    PRO+           0.132                8
1834-11-07A-14    PRO+           0.110                2
1834-11-07A-11    PRO+           0.106                4
1834-11-08A-28    PRO+           0.129                8
1834-11-08A-27    PRO+           0.108                4
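
The reported Wilcoxon-Mann-Whitney result (p=0.0206) can be checked against the root ratings in Table 10. The sketch below is not the StatXact procedure itself, whose settings are not given in the text; it assumes a one-sided exact permutation test on midranks, asking how often a random split of the twelve ratings gives the protein positive group a rank sum at least as low as the one observed. Function names are hypothetical.

```python
from itertools import combinations

# Root ratings (1 = resistant, 9 = susceptible) transcribed from Table 10.
pro_negative = [8, 9, 9, 9, 9, 9]
pro_positive = [9, 8, 2, 4, 8, 4]

def midranks(values):
    """Rank the values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def exact_lower_tail(group_a, group_b):
    """Count splits whose rank sum for group_a is <= the observed rank sum."""
    ranks = midranks(group_a + group_b)
    n_a = len(group_a)
    observed = sum(ranks[:n_a])
    as_extreme = 0
    total = 0
    for subset in combinations(range(len(ranks)), n_a):
        total += 1
        if sum(ranks[i] for i in subset) <= observed:
            as_extreme += 1
    return observed, as_extreme, total

observed, as_extreme, total = exact_lower_tail(pro_positive, pro_negative)
p_value = as_extreme / total
print(observed, as_extreme, total, round(p_value, 4))  # 25.5 19 924 0.0206
```

Under this assumption the protein positive rank sum is 25.5, and 19 of the 924 possible splits are at least as extreme, giving 19/924 ≈ 0.0206, which agrees with the value reported for the Wilcoxon-Mann-Whitney test.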



DNA analysis of progeny plants: Leaf samples from
1834-11.7A and 1834-11.8A progeny plants were placed in conical 50 ml
polypropylene tubes and dried in a Labconco Freeze Dry
Lyophilizer (Kansas City, MO) for 1-2 days. Lyophilized
leaves were then ground in a Tecator Cyclotec 1093 Sample
mill grinder (Hoganas, Sweden) and stored at -20 °C.
Genomic DNA was extracted by the following procedure: (1)
to a 25 ml conical tube containing 300-500 mg of ground
tissue, 9 ml of CTAB (cetyl trimethylammonium bromide)
solution was added, and the tube was incubated at 65°C for 1 hour; (2)
4.5 ml of chloroform:octanol (24:1) was added and mixed
gently for 5 minutes; (3) samples were centrifuged at
2000 rpm and DNA was precipitated from the supernatant
with an equal volume of isopropanol; (4) DNA was
collected on a glass hook, washed in ethanol, and
dissolved in TE (10 mM Tris.HCl, 0.5 mM EDTA, pH 8.0).

Genomic DNA was digested at 37 °C for 2 hours in an
Eppendorf tube containing the following mixture:
8 µl of 800 µg/ml DNA, 2 µl 1 mg/ml BSA (bovine serum
albumin), 2 µl 10x buffer, 1 µl SacI, 1 µl EcoRI, and 6 µl
H2O. Digested DNA samples were electrophoresed overnight
at 40 mA in a 0.85% SeaKem LE agarose gel (FMC, Rockland,
Maine). The gel was blotted onto Millipore Immobilon-Ny+
(Bedford, MA) membrane overnight in 20X SSC (NaCl 175.2
g/l, Na citrate 88 g/l). The probe DNA was cut with
BamHI/SacI (NEB, Beverly, MA) from pDAB1551 plasmid,
which released a 7356 bp fragment containing the open
reading frame of the rebuilt tcdA gene. This 7356 bp
fragment was labeled with P32 using a Stratagene Prime-It
RmT dCTP-Labeling Reaction kit (La Jolla, CA) and used
for Southern hybridization. Hybridization was conducted
in hybridization buffer (10% polyethylene glycol, 7% SDS
[sodium dodecyl sulfate], 0.6X SSC, 10 mM NaPO4, 5 mM
EDTA, 10 ng/ml denatured salmon sperm) at 60 °C overnight.
After hybridization, the membrane was washed with 10X SSC
plus 0.1% SDS at 60 °C for 30 min and exposed to X-ray
film (Hyperfilm® MP, Amersham Life Sciences, Piscataway,
NJ) for 1-2 days.
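
As a consistency check on the restriction digest described above, the component volumes can be summed and the final concentrations computed. This sketch takes all volumes in the digest mixture as microliters, which is an assumption where the printed units are unclear; the dictionary keys and variable names are hypothetical.

```python
# Component volumes of the digest mixture, taken as microliters (assumption).
digest_ul = {
    "genomic DNA (800 ug/ml)": 8.0,
    "BSA (1 mg/ml)": 2.0,
    "10x buffer": 2.0,
    "SacI": 1.0,
    "EcoRI": 1.0,
    "H2O": 6.0,
}

total_ul = sum(digest_ul.values())

# Mass of genomic DNA in the reaction: 800 ug/ml is 0.8 ug/ul.
dna_ug = digest_ul["genomic DNA (800 ug/ml)"] * 800.0 / 1000.0

# Final buffer strength: 10x stock diluted into the total volume.
buffer_fold = 10.0 * digest_ul["10x buffer"] / total_ul

# Final BSA concentration in ug/ml: 1 mg/ml is 1000 ug/ml.
bsa_ug_per_ml = 1000.0 * digest_ul["BSA (1 mg/ml)"] / total_ul

print(f"total volume: {total_ul} ul")
print(f"DNA digested: {dna_ug} ug")
print(f"buffer: {buffer_fold}x, BSA: {bsa_ug_per_ml} ug/ml")
```

On this reading the reaction is 20 µl containing 6.4 µg of genomic DNA, with the 10x buffer diluted to exactly 1x, which supports treating the volumes as microliters.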

Results summarized indicate that a pattern of 8
hybridizing bands (the size of the expected fragment and
larger) cosegregated with protein expression in 50% of
all progeny assayed. These results are characteristic of
a complex insertion at a single site. All seedlings
containing the insert also expressed toxin protein.
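
The single-site interpretation rests on the expected 1:1 segregation when a plant carrying one hemizygous insertion is crossed to a non-transgenic inbred. A minimal chi-square goodness-of-fit sketch is given below. The 6:6 counts are taken from the protein analysis (three of six plants positive in each seed lot); treating these same twelve plants as the Southern blot sample is an assumption, and the function name is hypothetical.

```python
def chi_square_one_to_one(n_positive, n_negative):
    """Chi-square statistic for a 1:1 expected segregation ratio (1 d.f.)."""
    expected = (n_positive + n_negative) / 2.0
    return ((n_positive - expected) ** 2
            + (n_negative - expected) ** 2) / expected

# Critical value of the chi-square distribution, 1 d.f., alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841

statistic = chi_square_one_to_one(6, 6)
consistent_with_single_locus = statistic < CRITICAL_DF1_ALPHA_05
print(statistic, consistent_with_single_locus)
```

With only twelve plants the test has little power, so a fit to 1:1 is compatible with a single insertion site but does not by itself exclude other arrangements; the cosegregating band pattern is the stronger evidence.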

Example 6

Transformation Of Rice With a Vector Carrying Plasmid
pDAB1553 Encoding Photorhabdus Toxins

A. Plasmid pDAB1553

Plasmid pDAB1553, containing tcdA driven by the maize
ubiquitin1 promoter and hpt (hygromycin
phosphotransferase, providing resistance to the antibiotic
hygromycin) under the control of 35T (a modified 35S
promoter), was used for transformation.

Preparation of rice transformation vectors was
accomplished in two steps. First, a modified
plant-optimized tcdA coding region was ligated into a rice
plant expression cassette plasmid. In this step, the
coding region was placed under the transcriptional
control of a promoter functional in plant cells. RNA
transcription termination and polyadenylation were
mediated by a downstream copy of the terminator region
from the Agrobacterium nopaline synthase gene. One
plasmid designed to function in this role is plasmid
pDAB1538 (described in the section on maize
transformation vectors). In the second step, the
complete gene comprised of the promoter, coding region,
and terminator region was ligated to a rice plant
transformation vector that contained a plant expressible
selectable marker gene which allowed the selection of
transformed rice plant cells amongst a background of
nontransformed cells. An example of such a vector is
pDAB354-Notl.

It is a feature of pDAB354-Notl that the hygromycin
phosphotransferase protein, which has as its substrate
hygromycin B and related compounds, is produced in plant
cells through transcription of its coding region mediated
by the Cauliflower Mosaic Virus 35S promoter and that
termination of transcription plus polyadenylation are
mediated by the nopaline synthase terminator region. It
is further a feature of pDAB354-Notl that any DNA
fragment containing flanking NotI sites can be cloned
into the unique NotI site of pDAB354-Notl, thus
physically linking the introduced DNA fragment to the
aforementioned selectable marker gene.

To prepare a plant-expressible gene to produce the
non-targeted TcdA protein in rice plant cells, DNA of a
plasmid (pA0H_4-OPTI) containing the plant-optimized tcdA
coding region (SEQ ID NO: 3) was cleaved with restriction
enzymes NcoI and SacI, and the large 7550 bp fragment was
ligated to similarly-cut DNA of plasmid pDAB1538 to
produce plasmid pDAB1551. DNA of pDAB1551 was then
digested with NotI, and the large 9933 bp fragment was
ligated to NotI digested DNA of pDAB354-Notl to produce
plasmid pDAB1553.

It is a feature of plasmid pDAB1553 that the ubi1
and 35S promoters are encoded on the same DNA strand.

B. Production of Rice transgenics

For initiation of embryogenic callus, mature seeds
of a Japonica cultivar, Taipei 309, were dehusked and
surface-sterilized in 70% ethanol for 2-5 min, followed
by a 30-45 min soak in 50% commercial bleach (2.6% sodium
hypochlorite) with a few drops of 'Liquinox' soap. The
seeds were then rinsed 3 times in sterile distilled water
and placed on filter paper before transferring to 'callus
induction' medium (i.e., NB). The NB medium consisted of
N6 macro elements (Chu, 1978, The N6 medium and its
application to anther culture of cereal crops. Proc.
Symp. Plant Tissue Culture, Peking Press, p43-56), B5
micro elements and vitamins (Gamborg et al., 1968,
Nutrient requirements of suspension cultures of soybean
root cells. Exp. Cell Res. 50: 151-158), 300 mg/L casein
hydrolysate, 500 mg/L L-proline, 500 mg/L L-glutamine, 30
g/L sucrose, 2 mg/L 2,4-dichloro-phenoxyacetic acid
(2,4-D), and 2.5 g/L gelrite (Schweizerhall, NJ) with the pH
adjusted to 5.8. The mature seed cultured on 'induction'
media were incubated in the dark at 28 °C. After 3 weeks
of culture, the emerging primary callus induced from the
scutellar region of mature embryo was transferred to
fresh NB medium for further maintenance.

About 140 µg of plasmid pDAB1553 DNA was
precipitated onto 60 mg of 1.0 micron (Bio-Rad) gold
particles as described herein.

For helium blasting, actively growing embryogenic
callus cultures, 2-4 mm in size, were subjected to a high
osmoticum treatment. This treatment included placing of
callus on NB medium with 0.2 M mannitol and 0.2 M
sorbitol (Vain et al., 1993, Osmoticum treatment enhances
particle bombardment-mediated transient and stable
transformation of maize. Plant Cell Rep. 12: 84-88) for
4 h before helium blasting. Following osmoticum
treatment, callus cultures were transferred to 'blasting'
medium (NB+2% agar) and covered with a stainless steel
screen (230 micron). The callus cultures were blasted at
2,000 psi helium pressure twice per target. After
blasting, callus was transferred back to the media with
high osmoticum overnight before placing on selection
medium, which consisted of NB medium with 30 mg/L
hygromycin. After 2 weeks, the cultures were transferred
to fresh selection medium with a higher concentration of
selection agent, i.e., NB+50 mg/L hygromycin (Li et al.,
1993, An improved rice transformation system using the
biolistic method. Plant Cell Rep. 12: 250-255).

Compact, white-yellow, embryogenic callus cultures,
recovered on NB+50 mg/L hygromycin, were regenerated by
transferring to 'pre-regeneration' (PR) medium + 50 mg/L
hygromycin. The PR medium consisted of NB medium with 2
mg/L benzyl aminopurine (BAP), 1 mg/L naphthalene acetic
acid (NAA), and 5 mg/L abscisic acid (ABA). After 2
weeks of culture in the dark, they were transferred to
'Regeneration' (RN) medium. The composition of RN
medium is NB medium with 3 mg/L BAP and 0.5 mg/L NAA.
The cultures on RN medium were incubated for 2 weeks at
28 °C under high fluorescent light (325 ft-candles). The
plantlets with 2 cm shoot were transferred to 1/2 MS
medium (Murashige and Skoog, 1962, A revised medium for
rapid growth and bioassays with tobacco tissue cultures.
Physiol. Plant. 15: 473-497) with 1/2 B5 vitamins, 10 g/L
sucrose, 0.05 mg/L NAA, 50 mg/L hygromycin and 2.5 g/L
gelrite adjusted to pH 5.8 in magenta boxes. When
plantlets were established with well-developed root
systems, they were transferred to soil (1 metromix: 1 top
soil) and raised in the greenhouse (29/24°C day/night
cycle, 50-60% humidity, 12 h photoperiod) until maturity.



EXAMPLE 7

Characterization Of Transgenic Rice Plants Expressing
Photorhabdus Toxin That Confer Insect Control.
A. Insect bioassays

Insect bioassays were performed using leaf discs, and the
plants were shown to be highly effective in controlling southern corn
rootworm. Diabrotica undecimpunctata howardi eggs were
obtained from French Ag Research and hatched in petri
dishes held at 28.5°C and 40% RH. The aerial parts were
sampled from the transgenic plants and placed singly
into inverted petri dishes (100x15 mm) containing 15 ml of
1.6% aqueous agar in the bottom to provide humidity and
filter paper in the top to absorb condensation. These
preparations were infested with five neonate larvae per
dish and held at 28.5°C and 40% RH for 3 days. Mortality
and larval weights were recorded. Weight data were
transformed using a logarithmic function to correct a
correlation between the magnitude of the mean and
variance.



Table 11

Treatment      Average Survivor       Presence of TcdA in greenhouse-grown
               Weight in mg (1)       plants (number of + / number of
               (Duncan's Grouping)    plants tested)
GUS Control    0.390 A
1553-33        0.170 BCD              ++
1553-44        0.167 BCD              +++
1553-62        0.125 CD               +++
1553-41        0.100 D                +++

Note: Means followed by the same letter are
not significantly different based on Duncan's
multiple range test (alpha=0.05).
(1) Insect groups weighing less than 0.1 mg were set to 0.03 mg
instead of zero to conduct a more conservative analysis.
Weight data were transformed (log10) for analyses. A single
replicate was used on each of three test dates. Plants were
sampled from magenta boxes.

The results demonstrate that, in leaf disc bioassays, several
rice events derived by transformation with the tcdA gene
had a statistically significant functional effect on
corn rootworm neonates.

Claims

1. An isolated nucleic acid of SEQ ID NO: 3 or SEQ ID
NO: 4.

2. A transgenic monocot cell having a genome comprising
SEQ ID NO: 3 or SEQ ID NO: 4.

3. A transgenic dicot cell having a genome comprising
SEQ ID NO: 3 or SEQ ID NO: 4.

4. A transgenic plant with a genome comprising a
nucleic acid of SEQ ID NO: 3 or SEQ ID NO: 4 that imparts
insect resistance.

5. A transgenic plant of claim 4 wherein the plant is
rice.

6. A transgenic plant of claim 4 wherein the plant is
maize.

7. A transgenic plant of claim 4 wherein the plant is
tobacco.

SEQUENCE LISTING

<110> Petell, Jim
      Merlo, Donald
      Herman, Rod
      Roberts, Jean
      Guo, Lining
      Schafer, Barry
      Sukhapinda, Kitisri
      Owens Merlo, Ann

<120> Transgenic Plants Expressing Photorhabdus Toxin

<130> 50698

<140>
<141>

<150> US 60/148,356
<151> 1999-08-11

<160> 8

<170> PatentIn Ver. 2.0

<210> 1
<211> 7551
<212> DNA
<213> Photorhabdus luminescens

<220>
<221> CDS
<222> (1)..(7548)

<400> 1

atg aac gag tct gta aaa gag ata cct gat gta tta aaa agc cag tgt 48
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

ggt ttt aat tgt ctg aca gat att agc cac agc tct ttt aat gaa ttt 96
Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

cgc cag caa gta tct gag cac ctc tcc tgg tcc gaa aca cac gac tta 144
Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

tat cat gat gca caa cag gca caa aag gat aat cgc ctg tat gaa gcg 192
Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

cgt att ctc aaa cgc gcc aat ccc caa tta caa aat gcg gtg cat ctt 240
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80

gcc att ctc gct ccc aat gct gaa ctg ata ggc tat aac aat caa ttt 288
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

agc ggt aga gcc agt caa tat gtt gcg ccg ggt acc gtt tct tcc atg 336
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

ttc tcc ccc gcc gct tat ttg act gaa ctt tat cgt gaa gca cgc aat 384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

tta cac gca agt gac tcc gtt tat tat ctg gat acc cgc cgc cca gat 432
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140

ctc aaa tca atg gcg ctc agt cag caa aat atg gat ata gaa tta tcc 480
Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160

aca ctc tct ttg tcc aat gag ctg tta ttg gaa agc att aaa act ga3 528
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175

tct aaa ctg gaa aac tat act aaa gtg atg gaa atg ctc tcc act ttc 576
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190

cgt cct tcc ggc gca acg cct tat cat gat gct tat gaa aat gtg cgt 624
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205

gaa gtt atc cag cta caa gat cct gga ctt gag caa ctc aat gca tca 672
Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220

ccg gca att gcc ggg ttg atg cat caa gcc tcc cta ttg ggt att aac 720
Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

gct tca atc tcg cct gag cta ttt aat att ctg acg gag gag att acc 768
Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255

gaa ggt aat gct gag gaa ctt tat aag aaa aat ttt ggt aat atc gaa 816
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

ccg gcc tca ttg gct atg ccg gaa tac ctt aaa cgt tat tat aat tta 864
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

agc gat gaa gaa ctt agt cag ttt att ggt aaa gcc agc aat ttt ggt 912
Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

caa cag gaa tat agt aat aac caa ctt att act ccg gta gtc aac agc 960
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

agt gat ggc acg gtt aag gta tat cgg atc acc cgc gaa tat aca acc 1008
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

aat gct tat caa atg gat gtg gag cta ttt ccc ttc ggt ggt gag aat 1056
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

tat cgg tta gat tat aaa ttc aaa aat ttt tat aat gcc tct tat tta 1104
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

tec ate aag tta aat gat aaa aga gaa ctt gtt cga act gaa ggc get 1152 

Ser lie Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala 
370 *" 375 380 

cct caa gtc aat ata gaa tac tec gca aat ate aca tta aat ace get 1200 

Pro Gin Val Asn lie Glu Tyr Ser Ala Asn He Thr Leu Asn Thr Ala 
385 390 395 400 

gat ate agt caa cct ttt gaa att ggc ctg aca cga gta ctt cct tec 1248 

Asp He Ser Gin Pro Phe Glu He Gly Leu Thr Arg Val Leu Pro Ser 

405 410 415 

ggt tct tgg gca tat gec gec gca aaa ttt acc gtt gaa gag tat aac 1296 

Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn 
420 425 430 

caa tac tct ttt ctg eta aaa ctt aac aag get att cgt eta tea cgt 1344 

Gin Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala He Arg Leu Ser Arg 
435 440 445 

gcg aca gaa ttg tea ccc acg att ctg gaa ggc att gtg cgc agt gtt 1392 

Ala Thr Glu Leu Ser Pro Thr He Leu Glu Gly He Val Arg Ser Val 
450 455 460 

aat eta caa ctg gat ate aac aca gac gta tta ggt aaa gtt ttt ctg 1440 

Asn Leu Gin Leu Asp He Asn Thr Asp Val Leu Gly Lys Val Phe Leu 
465 470 475 480 

act aaa tat tat atg cag cgt tat get att cat get gaa act gee ctg 1488 

Thr Lys Tyr Tyr Met Gin Arg Tyr Ala He His Ala Glu Thr Ala Leu 

485 490 495 

ata eta tgc aac gcg cct att tea caa cgt tea tat gat aat caa cct 1536 

He Leu Cys Asn Ala Pro He Ser Gin Arg Ser Tyr Asp Asn Gin Pro 
500 505 510 

age caa ttt gat cgc ctg ttt aat acg cca tta ctg aac gga caa tat 1584 

Ser Gin Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gin Tyr 
515 520 525 

ttt tct acc ggc gat gag gag att gat tta aat tea ggt age acc ggc 1632 

Phe Ser Thr Gly Asp Glu Glu He Asp Leu Asn Ser Gly Ser Thr Gly 
530 535 540 

gat tgg cga aaa acc ata ctt aag cgt gca ttt aat att gat gat gtc 1680 

Asp Trp Arg Lys Thr He Leu Lys Arg Ala Phe Asn He Asp Asp Val 
545 550 555 560 

teg etc ttc cgc ctg ctt aaa att acc gac cat gat aat aaa gat gga 1728 

Ser Leu Phe Arg Leu Leu Lys He Thr Asp His Asp Asn Lys Asp Gly 

565 570 575 

aaa att aaa aat aac eta aag aat ctt tec aat tta tat att gga aaa 1776 

Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

tta ctg gca gat att cat caa tta acc att gat gaa ctg gat tta tta 1824

Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu

595 600 605 

ctg att gcc gta ggt gaa gga aaa act aat tta tec get ate agt gat 1872 

Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620

aag caa ttg get acc ctg ate aga aaa etc aat act att acc age tgg 1920 

Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr Ser Trp 

625 630 635 640 

eta cat aca cag aag tgg agt gta ttc cag eta ttt ate atg acc tec 1968 

Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe He Met Thr Ser 
645 650 655 

acc age tat aac aaa acg eta acg cct gaa att aag aat ttg ctg gat 2016 

Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu He Lys Asn Leu Leu Asp 

660 665 670 

acc gtc tac cac ggt tta caa ggt ttt gat aaa gac aaa gca gat ttg 2064 

Thr Val Tyr His Gly Leu Gin Gly Phe Asp Lys Asp Lys Ala Asp Leu 

675 680 685 

eta cat gtc atg gcg ccc tat att gcg gcc acc ttg caa tta tea teg 2112 

Leu His Val Met Ala Pro Tyr He Ala Ala Thr Leu Gin Leu Ser Ser 
690 695 700 

gaa aat gtc gcc cac teg gta etc ctt tgg gca gat aag tta cag ccc 2160 

Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gin Pro 

705 710 715 720 

ggc gac ggc gca atg aca gca gaa aaa ttc tgg gac tgg ttg aat act 2208 

Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 
725 730 735 

aag tat acg ccg ggt tea teg gaa gcc gta gaa acg cag gaa cat ate 2256 

Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gin Glu His He 

740 745 750 

gtt cag tat tgt cag get ctg gca caa ttg gaa atg gtt tac cat tec 2304 

Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu Glu Met Val Tyr His Ser 

755 760 765 

acc ggc ate aac gaa aac gcc ttc cgt eta ttt gtg aca aaa cca gag 2352 

Thr Gly He Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 
770 775 780 

atg ttt ggc get gca act gga gca gcg ccc gcg cat gat gcc ctt tea 2400 

Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 

785 790 795 800 

ctg att atg ctg aca cgt ttt gcg gat tgg gtg aac gca eta ggc gaa 2448 

Leu He Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 
805 810 815 

aaa gcg tec teg gtg eta gcg gca ttt gaa get aac teg tta acg gca 2496 

Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 

820 825 830 

gaa caa ctg gct gat gcc atg aat ctt gat gct aat ttg ctg ttg caa 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845 

gcc agt att caa gca caa aat cat caa cat ctt ccc cca gta act cca 2592 
Ala Ser He Gin Ala Gin Asn His Gin His Leu Pro Pro Val Thr Pro 
850 855 860 

gaa aat gcg ttc tec tgt tgg aca tct ate aat act ate ctg caa tgg 2640 
Glu Asn Ala Phe Ser Cys Trp Thr Ser He Asn Thr He Leu Gin Trp 
865 870 875 880 

gtt aat gtc gca caa caa ttg aat gtc gcc cca cag ggc gtt tec get 2688 
Val Asn Val Ala Gin Gin Leu Asn Val Ala Pro Gin Gly Val Ser Ala 
885 890 895 

ttg gtc ggg ctg gat tat att caa tea atg aaa gag aca ccg acc tat 2736 
Leu Val Gly Leu Asp Tyr He Gin Ser Met Lys Glu Thr Pro Thr Tyr 
900 905 910 

gcc cag tgg gaa aac gcg gca ggc gta tta acc gcc ggg ttg aat tea 2784 
Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 
915 920 925 

caa cag gct aat aca tta cac gct ttt ctg gat gaa tct cgc agt gcc 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940 

gca tta age acc tac tat ate cgt caa gtc gcc aag gca gcg gcg get 2880 
Ala Leu Ser Thr Tyr Tyr He Arg Gin Val Ala Lys Ala Ala Ala Ala 
945 950 955 960 

att aaa age cgt gat gac ttg tat caa tac tta ctg att gat aat cag 2928 
He Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu He Asp Asn Gin 
965 970 975 

gtt tct gcg gca ata aaa acc acc cgg atc gcc gaa gcc att gcc agt 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990 

att caa ctg tac gtc aac egg gca ttg gaa aat gtg gaa gaa aat gcc 3024 
lie Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 
995 1000 1005 

aat teg ggg gtt ate age cgc caa ttc ttt ate gac tgg gac aaa tac 3072 
Asn Ser Gly Val He Ser Arg Gin Phe Phe He Asp Trp Asp Lys Tyr 
1010 1015 1020 

aat aaa cgc tac age act tgg gcg ggt gtt tct caa tta gtt tac tac 3120 
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr 
1025 1030 1035 1040 

ccg gaa aac tat att gat ccg acc atg cgt ate gga caa acc aaa atg 3168 
Pro Glu Asn Tyr lie Asp Pro Thr Met Arg lie Gly Gin Thr Lys Met 
1045 1050 1055 

atg gac gca tta ctg caa tec gtc age caa age caa tta aac gcc gat 3216 
Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 
1060 1065 1070 

acc gtc gaa gat gcc ttt atg tct tat ctg aca teg ttt gaa caa gtg 3264 
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

get aat ctt aaa gtt att age gca tat cac gat aat att aat aac gat 3312 
Ala Asn Leu Lys Val lie Ser Ala Tyr His Asp Asn lie Asn Asn Asp 
1090 1095 1100 

caa ggg ctg acc tat ttt ate gga etc agt gaa act gat gee ggt gaa 3360 
Gin Gly Leu Thr Tyr Phe lie Gly Leu Ser Glu Thr Asp Ala Gly Glu 
1105 1110 1115 1120 

tat tat tgg cgc agt gtc gat cac agt aaa ttc aac gac ggt aaa ttc 3408 
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 
1125 1130 1135 

gcg get aat gee tgg agt gaa tgg cat aaa att gat tgt cca att aac 3456 
Ala Ala Asn Ala Trp Ser Glu Trp His Lys He Asp Cys Pro He Asn 
1140 1145 1150 

cct tat aaa age act ate cgt cca gtg ata tat aaa tec cgc ctg tat 3504 
Pro Tyr Lys Ser Thr lie Arg Pro Val He Tyr Lys Ser Arg Leu Tyr 
1155 1160 1165 

ctg ctc tgg ttg gaa caa aag gag atc acc aaa cag aca gga aat agt 3552
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser
1170 1175 1180

aaa gat ggc tat caa act gaa acg gat tat cgt tat gaa cta aaa ttg 3600
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu
1185 1190 1195 1200

gcg cat ate cgc tat gat ggc act tgg aat acg cca ate acc ttt gat 3648 
Ala His lie Arg Tyr Asp Gly Thr Trp Asn Thr Pro He Thr Phe Asp 
1205 1210 1215 

gtc aat aaa aaa ata tec gag eta aaa ctg gaa aaa aat aga gcg ccc 3696 
Val Asn Lys Lys He Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 
1220 1225 1230 

gga ctc tat tgt gcc ggt tat caa ggt gaa gat acg ttg ctg gtg atg 3744
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245

ttt tat aac caa caa gac aca eta gat agt tat aaa aac get tea atg 3792 
Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 
1250 1255 1260 

caa gga eta tat ate ttt get gat atg gca tec aaa gat atg acc cca 3840 
Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 
1265 1270 1275 1280 

gaa cag age aat gtt tat egg gat aat age tat caa caa ttt gat acc 3888 
Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 
1285 1290 1295 

aat aat gtc aga aga gtg aat aac cgc tat gca gag gat tat gag att 3936 
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu lie 
1300 1305 1310 

cct tec teg gta agt age cgt aaa gac tat ggt tgg gga gat tat tac 3984 
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 
1315 1320 1325

ctc agc atg gta tat aac gga gat att cca act atc aat tac aaa gcc 4032
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala
1330 1335 1340 

gca tea agt gat tta aaa ate tat ate tea cca aaa tta aga att att 4080 
Ala Ser Ser Asp Leu Lys lie Tyr He Ser Pro Lys Leu Arg He He 
1345 1350 1355 1360 

cat aat gga tat gaa gga cag aag cgc aat caa tgc aat ctg atg aat 4128 
His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 
1365 1370 1375 

aaa tat ggc aaa eta ggt gat aaa ttt att gtt tat act age ttg ggg 4176 
Lys Tyr Gly Lys Leu Gly Asp Lys Phe He Val Tyr Thr Ser Leu Gly 
1380 1385 1390 

gtc aat cca aat aac teg tea aat aag etc atg ttt tac ccc gtc tat 4224 
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 
1395 1400 1405 

caa tat age gga aac ace agt gga etc aat caa ggg aga eta eta ttc 4272 
Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 
1410 1415 1420 

cac cgt gac acc act tat cca tct aaa gta gaa get tgg att cct gga 4320 
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp He Pro Gly 
1425 1430 1435 1440 

gca aaa cgt tct eta acc aac caa aat gee gee att ggt gat gat tat 4368 
Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Gly Asp Asp Tyr 
1445 1450 1455 

get aca gac tct ctg aat aaa ccg gat gat ctt aag caa tat ate ttt 4416 
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr He Phe 
1460 1465 1470 

atg act gac agt aaa ggg act gct act gat gtc tca ggc cca gta gag 4464
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485

att aat act gca att tct cca gca aaa gtt cag ata ata gtc aaa gcg 4512 
He Asn Thr Ala He Ser Pro Ala Lys Val Gin He He Val Lys Ala 
1490 1495 1500 

ggt ggc aag gag caa act ttt acc gca gat aaa gat gtc tec att cag 4560 
Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser He Gin 
1505 1510 1515 1520 

cca tca cct agc ttt gat gaa atg aat tat caa ttt aat gcc ctt gaa 4608
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu
1525 1530 1535 

ata gac ggt tct ggt ctg aat ttt att aac aac tea gec agt att gat 4656 
He Asp Gly Ser Gly Leu Asn Phe He Asn Asn Ser Ala Ser He Asp 
1540 1545 1550 

gtt act ttt acc gca ttt gcg gag gat ggc cgc aaa ctg ggt tat gaa 4704 
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 
1555 1560 1565

agt ttc agt att cct gtt acc ctc aag gta agt acc gat aat gcc ctg 4752

Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu

1570 1575 1580 

acc ctg cac cat aat gaa aat ggt gcg caa tat atg caa tgg caa tec 4800 

Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 

1585 1590 1595 1600 

tat cgt acc cgc ctg aat act eta ttt gec cgc cag ttg gtt gca cgc 4848 

Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 

1605 1610 1615 

gec acc acc gga ate gat aca att ctg agt atg gaa act cag aat att 4896 

Ala Thr Thr Gly He Asp Thr He Leu Ser Met Glu Thr Gin Asn lie 

1620 1625 1630 

cag gaa ccg cag tta ggc aaa ggt ttc tat get acg ttc gtg ata cct 4944 

Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro

1635 1640 1645

ccc tat aac cta tca act cat ggt gat gaa cgt tgg ttt aag ctt tat 4992

Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 

1650 1655 1660 

ate aaa cat gtt gtt gat aat aat tea cat att ate tat tea ggc cag 5040 

He Lys His Val Val Asp Asn Asn Ser His He He Tyr Ser Gly Gin 

1665 1670 1675 1680 

eta aca gat aca aat ata aac ate aca tta ttt att cct ctt gat gat 5088 

Leu Thr Asp Thr Asn He Asn He Thr Leu Phe He Pro Leu Asp Asp 

1685 1690 1695 

gtc cca ttg aat caa gat tat cac gee aag gtt tat atg acc ttc aag 5136 

Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 

1700 1705 1710 

aaa tea cca tea gat ggt acc tgg tgg ggc cct cac ttt gtt aga gat 5184 

Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 

1715 1720 1725 

gat aaa gga ata gta aca ata aac cct aaa tec att ttg acc cat ttt 5232 

Asp Lys Gly He Val Thr lie Asn Pro Lys Ser lie Leu Thr His Phe 

1730 1735 1740 

gag age gtc aat gtc ctg aat aat att agt age gaa cca atg gat ttc 5280 

Glu Ser Val Asn Val Leu Asn Asn He Ser Ser Glu Pro Met Asp Phe 

1745 1750 1755 1760 

age ggc get aac age etc tat ttc tgg gaa ctg ttc tac tat acc ccg 5328 

Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 

1765 1770 1775 

atg ctg gtt get caa cgt ttg ctg cat gaa cag aac ttc gat gaa gee 5376 

Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 

1780 1785 1790 

aac cgt tgg ctg aaa tat gtc tgg agt cca tec ggt tat att gtc cac 5424 

Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr He Val His 

1795 1800 1805

ggc cag att cag aac tac cag tgg aac gtc cgc ccg tta ctg gaa gac 5472
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp
1810 1815 1820 

acc agt tgg aac agt gat cct ttg gat tec gtc gat cct gac gcg gta 5520 
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 
1825 1830 1835 1840 

gca cag cac gat cca atg cac tac aaa gtt tea act ttt atg cgt acc 5568 
Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 
1845 1850 1855 

ttg gat eta ttg ata gca cgc ggc gac cat get tat cgc caa ctg gaa 5616 
Leu Asp Leu Leu He Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 
1860 1865 1870 

cga gat aca etc aac gaa gcg aag atg tgg tat atg caa gcg ctg cat 5664 
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gin Ala Leu His 
1875 1880 1885 

eta tta ggt gac aaa cct tat eta ccg ctg agt acg aca tgg agt gat 5712 
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 
1890 1895 1900 

cca cga eta gac aga gee gcg gat ate act acc caa aat get cac gac 5760 
Pro Arg Leu Asp Arg Ala Ala Asp He Thr Thr Gin Asn Ala His Asp 
1905 1910 1915 1920 

age gca ata gtc get ctg egg cag aat ata cct aca ccg gca cct tta 5808 
Ser Ala He Val Ala Leu Arg Gin Asn He Pro Thr Pro Ala Pro Leu 
1925 1930 1935 

tea ttg cgc age get aat acc ctg act gat etc ttc ctg ccg caa ate 5856 
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin He 
1940 1945 1950 

aat gaa gtg atg atg aat tac tgg cag aca tta get cag aga gta tac 5904 
Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 
1955 1960 1965 

aat ctg cgt cat aac etc tct ate gac ggc cag ccg tta tat ctg cca 5952 
Asn Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro 
1970 1975 1980 

ate tat gee aca ccg gee gat ccg aaa gcg tta etc age gec gec gtt 6000 
He Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 
1985 1990 1995 2000 

gec act tct caa ggt gga ggc aag eta ccg gaa tea ttt atg tec ctg 6048 
Ala Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 
2005 2010 2015 



tgg cgt ttc ccg cac atg ctg gaa aat gcg cgc ggc atg gtt age cag 6096 
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin 
2020 2025 2030 

etc acc cag ttc ggc tec acg tta caa aat att ate gaa cgt cag gac 6144 
Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp 
2035 2040 2045 

gcg gaa gcg ctc aat gcg tta tta caa aat cag gcc gcc gag ctg ata 6192
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060

ttg act aac ctg agc att cag gac aaa acc att gaa gaa ttg gat gcc 6240
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080 

gag aaa acg gtg ttg gaa aaa tec aaa gcg gga gca caa teg cgc ttt 6288 
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe 
2085 2090 2095 

gat agc tac ggc aaa ctg tac gat gag aat atc aac gcc ggt gaa aac 6336
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110

caa gec atg acg eta cga gcg tec gec gee ggg ctt acc acg gca gtt 6384 
Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 
2115 2120 2125 

cag gca tec cgt ctg gee ggt gcg gcg get gat ctg gtg cct aac ate 6432 
Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He 
2130 2135 2140 

ttc ggc ttt gee ggt ggc ggc age cgt tgg ggg get ate get gag gcg 6480 
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala 
2145 2150 2155 2160 

aca ggt tat gtg atg gaa ttc tec gcg aat gtt atg aac acc gaa gcg 6528 
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 
2165 2170 2175 

gat aaa att age caa tct gaa acc tac cgt cgt cgc cgt cag gag tgg 6576 
Asp Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp 
2180 2185 2190 

gag ate cag egg aat aat gec gaa gcg gaa ttg aag caa ate gat get 6624 
Glu He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala 
2195 2200 2205 

cag etc aaa tea etc get gta cgc cgc gaa gee gec gta ttg cag aaa 6672 
Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 
2210 2215 2220 

acc agt ctg aaa acc caa caa gaa cag acc caa tct caa ttg gcc ttc 6720
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240 

ctg caa cgt aag ttc age aat cag gcg tta tac aac tgg ctg cgt ggt 6768 
Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 
2245 2250 2255 

cga ctg gcg gcg att tac ttc cag ttc tac gat ttg gee gtc gcg cgt 6816 
Arg Leu Ala Ala lie Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 
2260 2265 2270 

tgc ctg atg gca gaa caa get tac cgt tgg gaa etc aat gat gac tct 6864 
Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 
2275 2280 2285 

gec cgc ttc att aaa ccg ggc gee tgg cag gga acc tat gee ggt ctg 6912 
Ala Arg Phe He Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 
2290 2295 2300

ctt gca ggt gaa acc ttg atg ctg agt ctg gca caa atg gaa gac gct 6960
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320 

cat ctg aaa cgc gat aaa cgc gca tta gag gtt gaa cgc aca gta teg 7008 
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 
2325 2330 2335 

ctg gec gaa gtt tat gca gga tta cca aaa gat aac ggt cca ttt tec 7056 
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 
2340 2345 2350 

ctg get cag gaa att gac aag ctg gtg agt caa ggt tea ggc agt gec 7104 
Leu Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala 
2355 2360 2365 

ggc agt ggt aat aat aat ttg gcg ttc ggc gec ggc acg gac act aaa 7152 
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 
2370 2375 2380 

acc tct ttg cag gca tea gtt tea ttc get gat ttg aaa att cgt gaa 7200 
Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu 
2385 2390 2395 2400 

gat tac ccg gca teg ctt ggc aaa att cga cgt ate aaa cag ate age 7248 
Asp Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser 
2405 2410 2415 

gtc act ttg ccc gcg eta ctg gga ccg tat cag gat gta cag gca ata 7296 
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He 
2420 2425 2430 

ttg tct tac ggc gat aaa gec gga tta get aac ggc tgt gaa gcg ctg 7344 
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu 
2435 2440 2445 

gca gtt tct cac ggt atg aat gac age ggc caa ttc cag etc gat ttc 7392 
Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe 
2450 2455 2460 

aac gat ggc aaa ttc ctg cca ttc gaa ggc ate gec att gat caa ggc 7440 
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp Gin Gly 
2465 2470 2475 2480 

acg ctg aca ctg age ttc cca aat gca tct atg ccg gag aaa ggt aaa 7488 
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 
2485 2490 2495 

caa gec act atg tta aaa acc ctg aac gat ate att ttg cat att cgc 7536 
Gin Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu His He Arg 
2500 2505 2510 

tac acc att aaa taa 7551 
Tyr Thr He Lys 
2515



<210> 2
<211> 7515
<212> DNA

<213> Photorhabdus luminescens

<220>

<221> CDS

<222> (1)..(7512)

<400> 2 

atg caa aac tca tta tca agc act atc gat act att tgt cag aaa ctg 48

Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15

caa tta act tgt ccg gcg gaa att get ttg tat ccc ttt gat act ttc 96 
Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 
20 25 30 

egg gaa aaa act egg gga atg gtt aat tgg ggg gaa gca aaa egg att 144 
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg lie 
35 40 45 

tat gaa att gca caa gcg gaa cag gat aga aac eta ctt cat gaa aaa 192 
Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 
50 55 60 

cgt att ttt gee tat get aat ccg ctg ctg aaa aac get gtt egg ttg 240 
Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 
65 70 75 80 

ggt ace egg caa atg ttg ggt ttt ata caa ggt tat agt gat ctg ttt 288 
Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 
85 90 95 

ggt aat cgt get gat aac tat gee gcg ccg ggc teg gtt gca teg atg 336 
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met 
100 105 110 

ttc tea ccg gcg get tat ttg acg gaa ttg tac cgt gaa gee aaa aac 384 
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 
115 120 125 

ttg cat gac age age tea att tat tac eta gat aaa cgt cgc ccg gat 432 
Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 
130 135 140 

tta gca age tta atg etc age cag aaa aat atg gat gag gaa att tea 480 
Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu He Ser 
145 150 155 160 

acg ctg get etc tct aat gaa ttg tgc ctt gee ggg ate gaa aca aaa 528 
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 
165 170 175 

aca gga aaa tea caa gat gaa gtg atg gat atg ttg tea act tat cgt 576 
Thr Gly Lys Ser Gin Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg 
180 185 190 

tta agt gga gag aca cct tat cat cac get tat gaa act gtt cgt gaa 624 
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 
195 200 205

atc gtt cat gaa cgt gat cca gga ttt cgt cat ttg tca cag gca ccc 672

Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
210 215 220 

att gtt gct gct aag ctc gat cct gtg act ttg ttg ggt att agc tcc 720

Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser

225 230 235 240 

cat att teg cca gaa ctg tat aac ttg ctg att gag gag ate ccg gaa 768 

His He Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu 
245 250 255 

aaa gat gaa gee gcg ctt gat acg ctt tat aaa aca aac ttt ggc gat 816 

Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 
260 265 270 

att act act get cag tta atg tec cca agt tat ctg gec egg tat tat 864 

He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 
275 280 285 

ggc gtc tea ccg gaa gat att gee tac gtg acg act tea tta tea cat 912 

Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 
290 295 300 

gtt gga tat age agt gat att ctg gtt att ccg ttg gtc gat ggt gtg 960 

Val Gly Tyr Ser Ser Asp lie Leu Val He Pro Leu Val Asp Gly Val 

305 310 315 320 

ggt aag atg gaa gta gtt cgt gtt ace cga aca cca teg gat aat tat 1008 

Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 
325 330 335 

acc agt cag acg aat tat att gag ctg tat cca cag ggt ggc gac aat 1056 

Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly Asp Asn 
340 345 350 

tat ttg ate aaa tac aat eta age aat agt ttt ggt ttg gat gat ttt 1104 

Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365

tat ctg caa tat aaa gat ggt tec get gat tgg act gag att gee cat 1152 

Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He Ala His 
370 375 380 

aat ccc tat cct gat atg gtc ata aat caa aag tat gaa tea cag gcg 1200 

Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 

385 390 395 400 

aca ate aaa cgt agt gac tct gac aat ata etc agt ata ggg tta caa 1248 
Thr He Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin 
405 410 415 

aga tgg cat age ggt agt tat aat ttt gec gee gee aat ttt aaa att 1296 
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys He 
420 425 430 

gac caa tac tec ccg aaa get ttc ctg ctt aaa atg aat aag get att 1344 

Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala He 
435 440 445 

cgg ttg ctc aaa gct acc ggc ctc tct ttt gct acg ttg gag cgt att 1392
Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460 

gtt gat agt gtt aat age acc aaa tec ate acg gtt gag gta tta aac 1440 
Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 
465 470 475 480 

aag gtt tat egg gta aaa ttc tat att gat cgt tat ggc ate agt gaa 1488 
Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg Tyr Gly He Ser Glu 
485 490 495 

gag aca gee get att ttg get aat att aat ate tct cag caa get gtt 1536 
Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 
500 505 510 

ggc aat cag ctt age cag ttt gag caa eta ttt aat cac ccg ccg etc 1584 
Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 
515 520 525 

aat ggt att cgc tat gaa ate agt gag gac aac tec aaa cat ctt cct 1632 
Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His Leu Pro 
530 535 540 

aat cct gat ctg aac ctt aaa cca gac agt acc ggt gat gat caa cgc 1680 
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg 
545 550 555 560 

aag gcg gtt tta aaa cgc gcg ttt cag gtt aac gee agt gag ttg tat 1728 
Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 
565 570 575 

cag atg tta ttg ate act gat cgt aaa gaa gac ggt gtt ate aaa aat 1776 
Gin Met Leu Leu He Thr Asp Arg Lys Glu Asp Gly Val He Lys Asn 
580 585 590 

aac tta gag aat ttg tct gat ctg tat ttg gtt agt ttg ctg gec cag 1824 
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin 
595 600 605 

att cat aac ctg act att get gaa ttg aac att ttg ttg gtg att tgt 1872 
He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val He Cys 
610 615 620 

ggc tat ggc gac acc aac att tat cag att acc gac gat aat tta gee 1920 
Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala 
625 630 635 640 

aaa ata gtg gaa aca ttg ttg tgg ate act caa tgg ttg aag acc caa 1968 
Lys lie Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin 
645 650 655 

aaa tgg aca gtt acc gac ctg ttt ctg atg acc acg gee act tac age 2016 
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 
660 665 670 

acc act tta acg cca gaa att age aat ctg acg get acg ttg tct tea 2064 
Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 
675 680 685 

act ttg cat ggc aaa gag agt ctg att ggg gaa gat ctg aaa aga gca 2112 
Thr Leu His Gly Lys Glu Ser Leu Ile Gly Glu Asp Leu Lys Arg Ala
690 695 700

atg gcg cct tgc ttc act tcg gct ttg cat ttg act tct caa gaa gtt 2160
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gln Glu Val
705 710 715 720

gcg tat gac ctg ctg ttg tgg ata gac cag att caa ccg gca caa ata 2208
Ala Tyr Asp Leu Leu Leu Trp Ile Asp Gln Ile Gln Pro Ala Gln Ile
725 730 735

act gtt gat ggg ttt tgg gaa gaa gtg caa aca aca cca acc agc ttg 2256
Thr Val Asp Gly Phe Trp Glu Glu Val Gln Thr Thr Pro Thr Ser Leu
740 745 750

aag gtg att acc ttt gct cag gtg ctg gca caa ttg agc ctg atc tat 2304
Lys Val Ile Thr Phe Ala Gln Val Leu Ala Gln Leu Ser Leu Ile Tyr
755 760 765

cgt cgt att ggg tta agt gaa acg gaa ctg tca ctg atc gtg act caa 2352
Arg Arg Ile Gly Leu Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gln
770 775 780

tct tct ctg cta gtg gca ggc aaa agc ata ctg gat cac ggt ctg tta 2400
Ser Ser Leu Leu Val Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu
785 790 795 800

acc ctg atg gcc ttg gaa ggt ttt cat acc tgg gtt aat ggc ttg ggg 2448
Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815

caa cat gcc tcc ttg ata ttg gcg gcg ttg aaa gac gga gcc ttg aca 2496
Gln His Ala Ser Leu Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr
820 825 830

gtt acc gat gta gca caa gct atg aat aag gag gaa tct ctc cta caa 2544
Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln
835 840 845

atg gca gct aat cag gtg gag aag gat cta aca aaa ctg acc agt tgg 2592
Met Ala Ala Asn Gln Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp
850 855 860

aca cag att gac gct att ctg caa tgg tta cag atg tct tcg gcc ttg 2640
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880

gcg gtt tct cca ctg gat ctg gca ggg atg atg gcc ctg aaa tat ggg 2688
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895

ata gat cat aac tat gct gcc tgg caa gct gcg gcg gct gcg ctg atg 2736
Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met
900 905 910

gct gat cat gct aat cag gca cag aaa aaa ctg gat gag acg ttc agt 2784
Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser
915 920 925

aag gca tta tgt aac tat tat att aat gct gtt gtc gat agt gct gct 2832
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala
930 935 940

gga gta cgt gat cgt aac ggt tta tat acc tat ttg ctg att gat aat 2880
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn
945 950 955 960

cag gtt tct gcc gat gtg ate act tea cgt att gca gaa get ate gee 2928 
Gin Val Ser Ala Asp Val lie Thr Ser Arg lie Ala Glu Ala He Ala 
965 970 975 

ggt att caa ctg tac gtt aac egg get tta aac cga gat gaa ggt cag 2976 
Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 
980 985 990 

ctt gca teg gac gtt agt acc cgt cag ttc ttc act gac tgg gaa cgt 3024 
Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 
995 1000 1005 

tac aat aaa cgt tac agt act tgg get ggt gtc tct gaa ctg gtc tat 3072 
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 
1010 1015 1020 

tat cca gaa aac tat gtt gat ccc act cag cgc att ggg caa acc aaa 3120 
Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin Thr Lys 
1025 1030 1035 1040 

atg atg gat gcg ctg ttg caa tec ate aac cag age cag eta aat gcg 3168 
Met Met Asp Ala Leu Leu Gin Ser He Asn Gin Ser Gin Leu Asn Ala 
1045 1050 1055 

gat acg gtg gaa gat get ttc aaa act tat ttg acc age ttt gag cag 3216 
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 
1060 1065 1070 

gta gca aat ctg aaa gta att agt get tac cac gat aat gtg aat gtg 3264 
Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val Asn Val 
1075 1080 1085 

gat caa gga tta act tat ttt ate ggt ate gac caa gca get ccg ggt 3312 
Asp Gin Gly Leu Thr Tyr Phe He Gly He Asp Gin Ala Ala Pro Gly 
1090 1095 1100 

acg tat tac tgg cgt agt gtt gat cac age aaa tgt gaa aat ggc aag 3360 
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 
1105 1110 1115 1120 

ttt gcc get aat get tgg ggt gag tgg aat aaa att acc tgt get gtc 3408 
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys He Thr Cys Ala Val 
1125 1130 1135 

aat cct tgg aaa aat ate ate cgt ccg gtt gtt tat atg tec cgc tta 3456 
Asn Pro Trp Lys Asn lie lie Arg Pro Val Val Tyr Met Ser Arg Leu 
1140 1145 1150 

tat ctg eta tgg ctg gag cag caa tea aag aaa agt gat gat ggt aaa 3504 
Tyr Leu Leu Trp Leu Glu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 
1155 1160 1165 

acc acg att tat caa tat aac tta aaa ctg gct cat att cgt tac gac 3552
Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
1170 1175 1180

ggt agt tgg aat aca cca ttt act ttt gat gtg aca gaa aag gta aaa 3600

Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 
1185 1190 1195 1200 

aat tac acg teg agt act gat get get gaa tct tta ggg ttg tat tgt 3648 

Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys 
1205 1210 1215 

act ggt tat caa ggg gaa gac act eta tta gtt atg ttc tat teg atg 3696 

Thr Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 
1220 1225 1230 

cag agt agt tat age tec tat acc gat aat aat gcg ccg gtc act ggg 3744 

Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 
1235 1240 1245 

eta tat att ttc get gat atg tea tea gac aat atg acg aat gca caa 3792 

Leu Tyr lie Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gin 
1250 1255 1260 

gca act aac tat tgg aat aac agt tat ccg caa ttt gat act gtg atg 3840 

Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met 
1265 1270 1275 1280 

gca gat ccg gat age gac aat aaa aaa gtc ata acc aga aga gtt aat 3888 

Ala Asp Pro Asp Ser Asp Asn Lys Lys Val lie Thr Arg Arg Val Asn 
1285 1290 1295 

aac cgt tat gcg gag gat tat gaa att cct tec tct gtg aca agt aac 3936 

Asn Arg Tyr Ala Glu Asp Tyr Glu lie Pro Ser Ser Val Thr Ser Asn 
1300 1305 1310 

agt aat tat tct tgg ggt gat cac agt tta acc atg ctt tat ggt ggt 3984 

Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 
1315 1320 1325 

agt gtt cct aat att act ttt gaa teg gcg gca gaa gat tta agg eta 4032 

Ser Val Pro Asn lie Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 
1330 1335 1340 

tct acc aat atg gca ttg agt att att cat aat gga tat gcg gga acc 4080 

Ser Thr Asn Met Ala Leu Ser He He His Asn Gly Tyr Ala Gly Thr 
1345 1350 1355 1360 

cgc cgt ata caa tgt aat ctt atg aaa caa tac get tea tta ggt gat 4128 

Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 
1365 1370 1375 

aaa ttt ata att tat gat tea tea ttt gat gat gca aac cgt ttt aat 4176 

Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 
1380 1385 1390 

ctg gtg cca ttg ttt aaa ttc gga aaa gac gag aac tea gat gat agt 4224 

Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 
1395 1400 1405 

att tgt ata tat aat gaa aac cct tec tct gaa gat aag aag tgg tat 4272 

He Cys He Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 
1410 1415 1420 

ttt tct tcg aaa gat gac aat aaa aca gcg gat tat aat ggt gga act 4320 
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 
1425 1430 1435 1440 

caa tgt ata gat get gga acc agt aac aaa gat ttt tat tat aat etc 4368 
Gin Cys lie Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 
1445 1450 1455 

cag gag att gaa gta att agt gtt act ggt ggg tat tgg teg agt tat 4416 
Gin Glu He Glu Val He Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 
1460 1465 1470 

aaa ata tec aac ccg att aat ate aat acg ggc att gat agt get aaa 4464 
Lys He Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser Ala Lys 
1475 1480 1485 

gta aaa gtc acc gta aaa gcg ggt ggt gac gat caa ate ttt act get 4512 
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe Thr Ala 
1490 1495 1500 

gat aat agt acc tat gtt cct cag caa ccg gca ccc agt ttt gag gag 4560 
Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 
1505 1510 1515 1520 

atg att tat cag ttc aat aac ctg aca ata gat tgt aag aat tta aat 4608 
Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 
1525 1530 1535 

ttc ate gac aat cag gca cat att gag att gat ttc acc get acg gca 4656 
Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala Thr Ala 
1540 1545 1550 

caa gat ggc cga ttc ttg ggt gca gaa act ttt att ate ccg gta act 4704 
Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro Val Thr 
1555 1560 1565 

aaa aaa gtt etc ggt act gag aac gtg att gcg tta tat age gaa aat 4752 
Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 
1570 1575 1580 

aac ggt gtt caa tat atg caa att ggc gca tat cgt acc cgt ttg aat 4800 
Asn Gly Val Gin Tyr Met Gin He Gly Ala Tyr Arg Thr Arg Leu Asn 
1585 1590 1595 1600 

acg tta ttc get caa cag ttg gtt age cgt get aat cgt ggc att gat 4848 
Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly He Asp 
1605 1610 1615 

gca gtg etc agt atg gaa act cag aat att cag gaa ccg caa tta gga 4896 
Ala Val Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly 
1620 1625 1630 

gcg ggc aca tat gtg cag ctt gtg ttg gat aaa tat gat gag tct att 4944 
Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser He 
1635 1640 1645 

cat ggc act aat aaa age ttt get att gaa tat gtt gat ata ttt aaa 4992 
His Gly Thr Asn Lys Ser Phe Ala He Glu Tyr Val Asp He Phe Lys 
1650 1655 1660 

gag aac gat agt ttt gtg att tat caa gga gaa ctt agc gaa aca agt 5040 
Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser 
1665 1670 1675 1680 

caa act gtt gtg aaa gtt ttc tta tec tat ttt ata gag gcg act gga 5088 
Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe lie Glu Ala Thr Gly 
1685 1690 1695 

aat aag aac cac tta tgg gta cgt get aaa tac caa aag gaa acg act 5136 
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu Thr Thr 
1700 1705 1710 

gat aag ate ttg ttc gac cgt act gat gag aaa gat ccg cac ggt tgg 5184 
Asp Lys lie Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 
1715 1720 1725 

ttt etc age gac gat cac aag ace ttt agt ggt etc tct tec gca cag 5232 
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 
1730 1735 1740 

gca tta aag aac gac agt gaa ccg atg gat ttc tct ggc gee aat get 5280 
Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 
1745 1750 1755 1760 

etc tat ttc tgg gaa ctg ttc tat tac acg ccg atg atg atg get cat 5328 
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 
1765 1770 1775 

cgt ttg ttg cag gaa cag aat ttt gat gcg gcg aac cat tgg ttc cgt 5376 
Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 
1780 1785 1790 

tat gtc tgg agt cca tec ggt tat ate gtt gat ggt aaa att get ate 5424 
Tyr Val Trp Ser Pro Ser Gly Tyr lie Val Asp Gly Lys He Ala He 
1795 1800 1805 

tac cac tgg aac gtg cga ccg ctg gaa gaa gac ace agt tgg aat gca 5472 
Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 
1810 1815 1820 

caa caa ctg gac tec ace gat cca gat get gta gee caa gat gat ccg 5520 
Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 
1825 1830 1835 1840 

atg cac tac aag gtg gct acc ttt atg gcg acg ttg gat ctg cta atg 5568 
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met 
1845 1850 1855 

gec cgt ggt gat get get tac cgc cag tta gag cgt gat acg ttg get 5616 
Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 
1860 1865 1870 

gaa get aaa atg tgg tat aca cag gcg ctt aat ctg ttg ggt gat gag 5664 
Glu Ala Lys Met Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 
1875 1880 1885 

cca caa gtg atg ctg agt acg act tgg get aat cca aca ttg ggt aat 5712 
Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 
1890 1895 1900 

get get tea aaa acc aca cag cag gtt cgt cag caa gtg ctt acc cag 5760 
Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 
1905 1910 1915 1920 

ttg cgt ctc aat agc agg gta aaa acc ccg ttg cta gga aca gcc aat 5808 
Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 
1925 1930 1935 

tec ctg acc get tta ttc ctg ccg cag gaa aat age aag etc aaa ggc 5856 
Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 
1940 1945 1950 

tac tgg egg aca ctg gcg cag cgt atg ttt aat tta cgt cat aat ctg 5904 
Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His Asn Leu 
1955 1960 1965 

teg att gac ggc cag ccg etc tec ttg ccg ctg tat get aaa ccg get 5952 
Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 
1970 1975 1980 

gat cca aaa get tta ctg agt gcg gcg gtt tea get tct caa ggg gga 6000 
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 
1985 1990 1995 2000 

gee gac ttg ccg aag gcg ccg ctg act att cac cgc ttc cct caa atg 6048 
Ala Asp Leu Pro Lys Ala Pro Leu Thr He His Arg Phe Pro Gin Met 
2005 2010 2015 

eta gaa ggg gca egg ggc ttg gtt aac cag ctt ata cag ttc ggt agt 6096 
Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe Gly Ser 
2020 2025 2030 

tea eta ttg ggg tac agt gag cgt cag gat gcg gaa get atg agt caa 6144 
Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met Ser Gin 
2035 2040 2045 

eta ctg caa acc caa gec age gag tta ata ctg acc agt att cgt atg 6192 
Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Met 
2050 2055 2060 

cag gat aac caa ttg gca gag ctg gat teg gaa aaa acc gee ttg caa 6240 
Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 
2065 2070 2075 2080 

gtc tct tta get gga gtg caa caa egg ttt gac age tat age caa ctg 6288 
Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 
2085 2090 2095 

tat gag gag aac ate aac gca ggt gag cag cga gcg ctg gcg tta cgc 6336 
Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 
2100 2105 2110 

tea gaa tct get att gag tct cag gga gcg cag att tec cgt atg gca 6384 
Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Met Ala 
2115 2120 2125 

ggc gcg ggt gtt gat atg gca cca aat ate ttc ggc ctg get gat ggc 6432 
Gly Ala Gly Val Asp Met Ala Pro Asn lie Phe Gly Leu Ala Asp Gly 
2130 2135 2140 

ggc atg cat tat ggt get att gec tat gee ate get gac ggt att gag 6480 
Gly Met His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 
2145 2150 2155 2160 

ttg agt gct tct gcc aag atg gtt gat gcg gag aaa gtt gct cag tcg 6528 
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser 
2165 2170 2175 

gaa ata tat cgc cgt cgc cgt caa gaa tgg aaa att cag cgt gac aac 6576 
Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn 
2180 2185 2190 

gca caa gcg gag att aac cag tta aac gcg caa ctg gaa tea ctg tct 6624 
Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 
2195 2200 2205 

att cgc cgt gaa gec get gaa atg caa aaa gag tac ctg aaa acc cag 6672 
He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin 
2210 2215 2220 

caa get cag gcg cag gca caa ctt act ttc tta aga age aaa ttc agt 6720 
Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 
2225 2230 2235 2240 

aat caa gcg tta tat agt tgg tta cga ggg cgt ttg tea ggt att tat 6768 
Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly He Tyr 
2245 2250 2255 

ttc cag ttc tat gac ttg gee gta tea cgt tgc ctg atg gca gag caa 6816 
Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin 
2260 2265 2270 

tec tat caa tgg gaa get aat gat aat tec att age ttt gtc aaa ccg 6864 
Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 
2275 2280 2285 

ggt gca tgg caa gga act tac gec ggc tta ttg tgt gga gaa get ttg 6912 
Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 
2290 2295 2300 

ata caa aat ctg gca caa atg gaa gag gca tat ctg aaa tgg gaa tct 6960 
He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 
2305 2310 2315 2320 

cgc get ttg gaa gta gaa cgc acg gtt tea ttg gca gtg gtt tat gat 7008 
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 
2325 2330 2335 

tea ctg gaa ggt aat gat cgt ttt aat tta gcg gaa caa ata cct gca 7056 
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin He Pro Ala 
2340 2345 2350 

tta ttg gat aag ggg gag gga aca gca gga act aaa gaa aat ggg tta 7104 
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 
2355 2360 2365 

tea ttg get aat get ate ctg tea get teg gtc aaa ttg tec gac ttg 7152 
Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 
2370 2375 2380 

aaa ctg gga acg gat tat cca gac agt ate gtt ggt age aac aag gtt 7200 
Lys Leu Gly Thr Asp Tyr Pro Asp Ser lie Val Gly Ser Asn Lys Val 
2385 2390 2395 2400 



cgt cgt att aag caa atc agt gtt tcg cta cct gca ttg gtt ggg cct 7248 
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro 
2405 2410 2415 

tat cag gat gtt cag get atg etc age tat ggt ggc agt act caa ttg 7296 
Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr Gin Leu 
2420 2425 2430 

ccg aaa ggt tgt tea gcg ttg get gtg tct cat ggt ace aat gat agt 7344 
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 
2435 2440 2445 

ggt cag ttc cag ttg gat ttc aat gac ggc aaa tac ctg cca ttt gaa 7392 
Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 
2450 2455 2460 

ggt att get ctt gat gat cag ggt aca ctg aat ctt caa ttt ccg aat 7440 
Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn Leu Gin Phe Pro Asn 
2465 2470 2475 2480 

get acc gac aag cag aaa gca ata ttg caa act atg age gat att att 7488 
Ala Thr Asp Lys Gin Lys Ala He Leu Gin Thr Met Ser Asp He He 
2485 2490 2495 

ttg cat att cgt tat acc atc cgt taa 7515 
Leu His Ile Arg Tyr Thr Ile Arg 
2500 



<210> 3 
<211> 7577 
<212> DNA 
<213> Artificial Sequence 

<220> 
<221> CDS 
<222> (3)..(7553) 

<220> 
<223> Description of Artificial Sequence: hemicot tcdA 

<400> 3 

cc atg gct aac gag tcc gtc aag gag atc cca gac gtc ctc aag tcc 47 
Met Ala Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser 
1 5 10 15 

caa tgc ggt ttc aac tgc etc act gac ate tec cac age tec ttc aac 95 
Gin Cys Gly Phe Asn Cys Leu Thr Asp He Ser His Ser Ser Phe Asn 
20 25 30 

gag ttc aga caa caa gtc tct gag cac etc tec tgg tec gag acc cat 143 
Glu Phe Arg Gin Gin Val Ser Glu His Leu Ser Trp Ser Glu Thr His 
35 40 45 

gac etc tac cat gac get cag caa get cag aag gac aac agg etc tac 191 
Asp Leu Tyr His Asp Ala Gin Gin Ala Gin Lys Asp Asn Arg Leu Tyr 
50 55 60 

gag gct agg atc ctc aag agg gct aac cca caa ctc cag aac gct gtc 239 
Glu Ala Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val 
65 70 75 

cac ctc gcc atc ttg gct cca aac gct gag ttg att ggt tac aac aac 287 
His Leu Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn 
80 85 90 95 

cag ttc tct ggc aga get age cag tac gtg get cct ggt aca gtc tec 335 
Gin Phe Ser Gly Arg Ala Ser Gin Tyr Val Ala Pro Gly Thr Val Ser 
100 105 110 

tec atg ttc age cca gec get tac etc act gag ttg tac cgc gag get 383 
Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala 
115 120 125 

agg aac ctt cat get tct gac tec gtc tac tac ttg gac aca cgc aga 431 
Arg Asn Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg 
130 135 140 

cca gac etc aag age atg gee etc age caa cag aac atg gac att gag 479 
Pro Asp Leu Lys Ser Met Ala Leu Ser Gin Gin Asn Met Asp He Glu 
145 150 155 

ttg tec ace etc tec ttg age aac gag ctt etc ttg gag tec ate aag 527 
Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser He Lys 
160 165 170 175 

act gag age aag ttg gag aac tac ace aag gtc atg gag atg etc tec 575 
Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser 
180 185 190 

acc ttc aga cca age ggt gca act cca tac cat gat gee tac gag aac 623 
Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn 
195 200 205 

gtc agg gag gtc ate caa ctt caa gac cct ggt ctt gag caa etc aac 671 
Val Arg Glu Val He Gin Leu Gin Asp Pro Gly Leu Glu Gin Leu Asn 
210 215 220 

get tct cca gee att get ggt ttg atg cac cag gca tec ttg etc ggt 719 
Ala Ser Pro Ala lie Ala Gly Leu Met His Gin Ala Ser Leu Leu Gly 
225 230 235 

ate aac gee tec ate tct cct gag ttg ttc aac ate ttg act gag gag 767 
He Asn Ala Ser He Ser Pro Glu Leu Phe Asn He Leu Thr Glu Glu 
240 245 250 255 

ate act gag ggc aac get gag gag ttg tac aag aag aac ttc ggc aac 815 
He Thr Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn 
260 265 270 

att gag cca gee tct ctt gca atg cct gag tac etc aag agg tac tac 863 
He Glu Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr 
275 280 285 

aac ttg tct gat gag gag ctt tct caa ttc att ggc aag get tec aac 911 
Asn Leu Ser Asp Glu Glu Leu Ser Gin Phe He Gly Lys Ala Ser Asn 
290 295 300 

ttc ggt caa cag gag tac agc aac aac cag ctc atc act cca gtt gtg 959 
Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val 
305 310 315 

aac tcc tct gat ggc act gtg aag gtc tac cgc atc aca cgt gag tac 1007 
Asn Ser Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr 
320 325 330 335 

acc aca aac gec tac caa atg gat gtt gag ttg ttc cca ttc ggt ggt 1055 
Thr Thr Asn Ala Tyr Gin Met Asp Val Glu Leu Phe Pro Phe Gly Gly 
340 345 350 

gag aac tac aga ctt gac tac aag ttc aag aac ttc tac aac gec tec 1103 
Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser 
355 360 365 

tac etc tec ate aag ttg aac gac aag agg gag ctt gtc agg act gag 1151 
Tyr Leu Ser He Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu 
370 375 380 

ggt get cct caa gtg aac att gag tac tct gec aac ate acc etc aac 1199 
Gly Ala Pro Gin Val Asn He Glu Tyr Ser Ala Asn He Thr Leu Asn 
385 390 395 

aca get gac ate tct caa cca ttc gag att ggt ttg acc aga gtc ctt 1247 
Thr Ala Asp He Ser Gin Pro Phe Glu lie Gly Leu Thr Arg Val Leu 
400 405 410 415 

ccc tct ggc tec tgg gee tac get gca gee aag ttc act gtt gag gag 1295 
Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu 
420 425 430 

tac aac cag tac tct ttc etc ttg aag etc aac aag gca att cgt etc 1343 
Tyr Asn Gin Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala He Arg Leu 
435 440 445 

age aga gee act gag ttg tct ccc acc ate ttg gag ggc att gtg agg 1391 
Ser Arg Ala Thr Glu Leu Ser Pro Thr He Leu Glu Gly He Val Arg 
450 455 460 

tct gtc aac ctt caa ctt gac ate aac act gat gtg ctt ggc aag gtc 1439 
Ser Val Asn Leu Gin Leu Asp He Asn Thr Asp Val Leu Gly Lys Val 
465 470 475 

ttc ctc acc aag tac tac atg caa cgc tac gcc atc cat gct gag act 1487 
Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr 
480 485 490 495 

gca etc ate etc tgc aac gca ccc ate tct caa cgc tec tac gac aac 1535 
Ala Leu He Leu Cys Asn Ala Pro He Ser Gin Arg Ser Tyr Asp Asn 
500 505 510 

cag cct tec cag ttc gac agg etc ttc aac act cct etc ttg aac ggc 1583 
Gin Pro Ser Gin Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly 
515 520 525 

cag tac ttc tec act ggt gat gag gag att gac etc aac tct ggc tec 1631 
Gin Tyr Phe Ser Thr Gly Asp Glu Glu He Asp Leu Asn Ser Gly Ser 
530 535 540 

aca ggt gac tgg aga aag acc atc ttg aag agg gcc ttc aac att gat 1679 
Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp 
545 550 555 

gat gtc tct ctc ttc cgt ctc ttg aag atc aca gat cac gac aac aag 1727 
Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys 

560 565 570 575 

gat ggc aag ate aag aac aac ttg aag aac ctt tec aac etc tac att 1775 

Asp Gly Lys He Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr He 

580 585 590 

ggc aag ttg ctt gca gac ate cac caa etc acc att gat gag ttg gac 1823 

Gly Lys Leu Leu Ala Asp He His Gin Leu Thr He Asp Glu Leu Asp 

595 600 605 

etc ttg etc att gca gtc ggt gag ggc aag acc aac etc tct gca ate 1871 

Leu Leu Leu He Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala lie 

610 615 620 

tct gac aag cag ttg gca acc etc ate agg aag ttg aac acc ate acc 1919 

Ser Asp Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr 

625 630 635 

tec tgg ctt cac acc cag aag tgg tct gtc ttc caa etc ttc ate atg 1967 

Ser Trp Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe lie Met 

640 645 650 655 

acc age acc tec tac aac aag acc etc act cct gag ate aag aac etc 2015 

Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu lie Lys Asn Leu 

660 665 670 

ttg gac aca gtc tac cac ggt ctc caa ggc ttc gac aag gac aag gct 2063 

Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala 

675 680 685 

gac ttg ctt cat gtc atg get ccc tac att gca gec acc etc caa etc 2111 

Asp Leu Leu His Val Met Ala Pro Tyr lie Ala Ala Thr Leu Gin Leu 

690 695 700 

tec tct gag aac gtg get cac tct gtc ttg etc tgg get gac aag etc 2159 

Ser Ser Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu 

705 710 715 

caa cct ggt gat ggt gee atg act get gag aag ttc tgg gac tgg etc 2207 

Gin Pro Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu 

720 725 730 735 

aac acc aag tac aca cca ggc tec tct gag get gtt gag act caa gag 2255 

Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gin Glu 

740 745 750 

cac att gtg caa tac tgc cag get ctt gca cag ttg gag atg gtc tac 2303 

His He Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu Glu Met Val Tyr 

755 760 765 

cac tec act ggc ate aac gag aac get ttc aga etc ttc gtc acc aag 2351 

His Ser Thr Gly lie Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys 

770 775 780 

cct gag atg ttc ggt get gee aca ggt get gca cct get cat gat get 2399 

Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala 

785 790 795 

ctc tcc ctc atc atg ttg acc agg ttc gct gac tgg gtc aac gct ctt 2447 

Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu 
800 805 810 815 

ggt gag aag get tec tct gtc ttg get gee ttc gag gee aac tec etc 2495 

Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu 
820 825 830 

act get gag caa ctt get gat gee atg aac ctt gat gee aac etc ttg 2543 

Thr Ala Glu Gin Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu 
835 840 845 

etc caa get tec att caa get cag aac cac caa cac etc cca cct gtc 2591 

Leu Gin Ala Ser lie Gin Ala Gin Asn His Gin His Leu Pro Pro Val 
850 855 860 

act cca gag aac get ttc tec tgc tgg acc tec ate aac ace ate etc 2639 

Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser He Asn Thr He Leu 
865 870 875 

caa tgg gtc aac gtg get cag caa etc aac gtg get cca caa ggt gtc 2687 

Gin Trp Val Asn Val Ala Gin Gin Leu Asn Val Ala Pro Gin Gly Val 

880 885 890 895 

tct get ttg gtc ggt ctt gac tac ate cag tec atg aag gag aca cca 2735 

Ser Ala Leu Val Gly Leu Asp Tyr He Gin Ser Met Lys Glu Thr Pro 
900 905 910 

acc tac get caa tgg gag aac gca get ggt gtc ttg act get ggt etc 2783 

Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu 
915 920 925 

aac tec caa cag gec aac acc etc cat get ttc ttg gat gag tct cgc 2831 

Asn Ser Gin Gin Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg 
930 935 940 

tct get gec etc tec acc tac tac ate agg caa gtc gee aag gca get 2879 

Ser Ala Ala Leu Ser Thr Tyr Tyr He Arg Gin Val Ala Lys Ala Ala 
945 950 955 

get gec ate aag tct cgc gat gac etc tac caa tac etc etc att gac 2927 

Ala Ala He Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu He Asp 

960 965 970 975 

aac cag gtc tct get gec ate aag acc acc agg ate get gag gec ate 2975 

Asn Gin Val Ser Ala Ala He Lys Thr Thr Arg He Ala Glu Ala He 
980 985 990 

get tec ate caa etc tac gtc aac cgc get ctt gag aac gtt gag gag 3023 

Ala Ser He Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu 
995 1000 1005 

aac gec aac tct ggt gtc ate tct cgc caa ttc ttc ate gac tgg gac 3071 
Asn Ala Asn Ser Gly Val He Ser Arg Gin Phe Phe He Asp Trp Asp 
1010 1015 1020 

aag tac aac aag agg tac tec acc tgg get ggt gtc tct caa ctt gtc 3119 

Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val 
1025 1030 1035 

tac tac cca gag aac tac att gac cca acc atg agg att ggt cag acc 3167 

Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr Met Arg He Gly Gin Thr 

1040 1045 1050 1055 

aag atg atg gat gct ctc ttg caa tct gtc tcc caa agc caa ctc aac 3215 
Lys Met Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn 
1060 1065 1070 

gct gac act gtg gag gat gcc ttc atg agc tac ctc acc tcc ttc gag 3263 
Ala Asp Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu 
1075 1080 1085 

caa gtt gec aac etc aag gtc ate tct get tac cat gac aac ate aac 3311 
Gin Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn He Asn 
1090 1095 1100 

aac gac caa ggt etc acc tac ttc att ggt etc tct gag act gat get 3359 
Asn Asp Gin Gly Leu Thr Tyr Phe He Gly Leu Ser Glu Thr Asp Ala 
1105 1110 1115 

ggt gag tac tac tgg aga tec gtg gac cac age aag ttc aac gat ggc 3407 
Gly Glu Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly 
1120 1125 1130 1135 

aag ttc get gca aac get tgg tct gag tgg cac aag att gac tgc cct 3455 
Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp His Lys He Asp Cys Pro 
1140 1145 1150 

ate aac cca tac aag tec acc ate aga cct gtc ate tac aag age cgc 3503 
He Asn Pro Tyr Lys Ser Thr He Arg Pro Val He Tyr Lys Ser Arg 
1155 1160 1165 

etc tac ttg etc tgg ctt gag cag aag gag ate acc aag caa act ggc 3551 
Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu He Thr Lys Gin Thr Gly 
1170 1175 1180 

aac tec aag gat ggt tac caa act gag act gac tac cgc tac gag ttg 3599 
Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu 
1185 1190 1195 

aag ttg get cac ate cgc tac gat ggt acc tgg aac act cca ate acc 3647 
Lys Leu Ala His He Arg Tyr Asp Gly Thr Trp Asn Thr Pro He Thr 
1200 1205 1210 1215 

ttc gat gtc aac aag aag atc agc gag ttg aag ttg gag aag aac cgt 3695 
Phe Asp Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg 
1220 1225 1230 

get cct ggt etc tac tgc get ggt tac caa ggt gag gac acc etc ttg 3743 
Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu 
1235 1240 1245 

gtc atg ttc tac aac cag caa gac acc ctt gac tec tac aag aac get 3791 
Val Met Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala 
1250 1255 1260 

tec atg caa ggt etc tac ate ttc get gac atg get tec aag gac atg 3839 
Ser Met Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Met 
1265 1270 1275 

act cca gag caa age aac gtc tac cgt gac aac tec tac caa cag ttc 3887 
Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe 
1280 1285 1290 1295 

gac acc aac aac gtc agg cgt gtc aac aac aga tac gct gag gac tac 3935 
Asp Thr Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 
1300 1305 1310 

gag ate cca age tct gtc age tct cgc aag gac tac ggc tgg ggt gac 3983 
Glu lie Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp 
1315 1320 1325 

tac tac etc age atg gtg tac aac ggt gac ate cca acc ate aac tac 4031 
Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp lie Pro Thr lie Asn Tyr 
1330 1335 1340 

aag get gee tct tec gac etc aaa ate tac ate age cca aag etc agg 4079 
Lys Ala Ala Ser Ser Asp Leu Lys lie Tyr lie Ser Pro Lys Leu Arg 
1345 1350 1355 

ate ate cac aac ggc tac gag ggt cag aag agg aac cag tgc aac ttg 4127 
He He His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu 
1360 1365 1370 1375 

atg aac aag tac ggc aag ttg ggt gac aag ttc att gtc tac acc tct 4175 
Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys Phe He Val Tyr Thr Ser 
1380 1385 1390 

ctt ggt gtc aac cca aac aac age tec aac aag etc atg ttc tac cca 4223 
Leu Gly Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro 
1395 1400 1405 

gtc tac caa tac tct ggc aac acc tct ggt etc aac cag ggt aga etc 4271 
Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu 
1410 1415 1420 

ttg ttc cac agg gac acc acc tac cca age aag gtg gag get tgg att 4319 
Leu Phe His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp He 
1425 1430 1435 

cct ggt gee aag agg tec etc acc aac cag aac get gee att ggt gat 4367 
Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Gly Asp 
1440 1445 1450 1455 

gac tac gec aca gac tec etc aac aag cct gat gac etc aag cag tac 4415 
Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr 
1460 1465 1470 

ate ttc atg act gac tec aag ggc aca gee act gat gtc tct ggt cca 4463 
He Phe Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro 
1475 1480 1485 

gtg gag ate aac act gca ate age cca gec aag gtc caa ate att gtc 4511 
Val Glu He Asn Thr Ala He Ser Pro Ala Lys Val Gin He He Val 
1490 1495 1500 

aag get ggt ggc aag gag caa acc ttc aca get gac aag gat gtc tec 4559 
Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser 
1505 1510 1515 

atc cag cca agc cca tcc ttc gat gag atg aac tac caa ttc aac gct 4607 
Ile Gln Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala 
1520 1525 1530 1535 

ctt gag att gat ggt tct ggc ctc aac ttc atc aac aac tct gct tcc 4655 
Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser 
1540 1545 1550 

att gat gtc acc ttc act gcc ttc get gag gat ggc cgc aag ttg ggt 4703 
He Asp Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly 
1555 1560 1565 

tac gag age ttc tec ate cca gtc acc ctt aag gtt tec act gac aac 4751 
Tyr Glu Ser Phe Ser He Pro Val Thr Leu Lys Val Ser Thr Asp Asn 
1570 1575 1580 

gca etc acc ctt cat cac aac gag aac ggt get cag tac atg caa tgg 4799 
Ala Leu Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp 
1585 1590 1595 

caa agc tac cgc acc agg ttg aac acc ctc ttc gca agg caa ctt gtg 4847 
Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val 
1600 1605 1610 1615 

gcc cgt gcc acc aca ggc att gac acc ate etc age atg gag acc cag 4895 
Ala Arg Ala Thr Thr Gly He Asp Thr He Leu Ser Met Glu Thr Gin 
1620 1625 1630 

aac ate caa gag cca cag ttg ggc aag ggt ttc tac gcc acc ttc gtc 4943 
Asn He Gin Glu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val 
1635 1640 1645 

atc cca cct tac aac ctc agc act cat ggt gat gag agg tgg ttc aag 4991 
Ile Pro Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys 
1650 1655 1660 

etc tac ate aag cac gtg gtt gac aac aac tec cac ate ate tac tct 5039 
Leu Tyr He Lys His Val Val Asp Asn Asn Ser His He He Tyr Ser 
1665 1670 1675 

ggt caa etc act gac acc aac ate aac ate acc etc ttc ate cca ctt 5087 
Gly Gin Leu Thr Asp Thr Asn He Asn He Thr Leu Phe He Pro Leu 
1680 1685 1690 1695 

gac gat gtc cca etc aac cag gac tac cat gcc aag gtc tac atg acc 5135 
Asp Asp Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr 
1700 1705 1710 

ttc aag aag tct cca tct gat ggc acc tgg tgg ggt cca cac ttc gtc 5183 
Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val 
1715 1720 1725 

cgt gat gac aag ggc ate gtc acc ate aac cca aag tec ate etc acc 5231 
Arg Asp Asp Lys Gly He Val Thr He Asn Pro Lys Ser He Leu Thr 
1730 1735 1740 

cac ttc gag tct gtc aac gtt etc aac aac ate tec tct gag cca atg 5279 
His Phe Glu Ser Val Asn Val Leu Asn Asn He Ser Ser Glu Pro Met 
1745 1750 1755 

gac ttc tct ggt gcc aac tec etc tac ttc tgg gag ttg ttc tac tac 5327 
Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr 
1760 1765 1770 1775 

aca cca atg ctt gtg gct caa agg ttg ctc cat gag cag aac ttc gat 5375 
Thr Pro Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp 
1780 1785 1790 

gag gcc aac agg tgg etc aag tac gtc tgg age cca tct ggt tac att 5423 
Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr He 
1795 1800 1805 

gtg cat ggt caa ate cag aac tac caa tgg aac gtc agg cca ttg ctt 5471 
Val His Gly Gin He Gin Asn Tyr Gin Trp Asn Val Arg Pro Leu Leu 
1810 1815 1820 

gag gac acc tec tgg aac tct gac cca ctt gac tct gtg gac cct gat 5519 
Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp 
1825 1830 1835 

get gtg get caa cat gac cca atg cac tac aag gtc tec acc ttc atg 5567 
Ala Val Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met 
1840 1845 1850 1855 

agg acc ttg gac etc ttg att gcc aga ggt gac cat get tac cgc caa 5615 
Arg Thr Leu Asp Leu Leu He Ala Arg Gly Asp His Ala Tyr Arg Gin 
1860 1865 1870 

ttg gag agg gac acc etc aac gag gca aag atg tgg tac atg caa get 5663 
Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gin Ala 
1875 1880 1885 

etc cac etc ttg ggt gac aag cca tac etc cca etc age acc act tgg 5711 
Leu His Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp 
1890 1895 1900 

tec gac cca agg ttg gac cgt get get gac ate acc act cag aac get 5759 
Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp He Thr Thr Gin Asn Ala 
1905 1910 1915 

cat gac tct gcc att gtt get etc agg cag aac ate cca act cct get 5807 
His Asp Ser Ala He Val Ala Leu Arg Gin Asn He Pro Thr Pro Ala 
1920 1925 1930 1935 

cca etc tec etc aga tct get aac acc etc act gac ttg ttc etc cca 5855 
Pro Leu Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro 
1940 1945 1950 

cag ate aac gag gtc atg atg aac tac tgg caa acc ttg get caa agg 5903 
Gin He Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg 
1955 1960 1965 

gtc tac aac etc aga cac aac etc tec att gat ggt caa cca etc tac 5951 
Val Tyr Asn Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr 
1970 1975 1980 

ctc cca atc tac gcc aca cca gct gac cca aag gct ctt ctc tct gct 5999
Leu Pro Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala
1985 1990 1995

get gtg get acc age caa ggt ggt ggc aag etc cca gag tec ttc atg 6047 
Ala Val Ala Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met 
2000 2005 2010 2015 

tcc ctc tgg agg ttc cca cac atg ttg gag aac gcc cgt ggc atg gtc 6095
Ser Leu Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val
2020 2025 2030

tcc caa ctc acc cag ttc ggt tcc acc ctc cag aac atc att gag agg 6143
Ser Gln Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg
2035 2040 2045 

caa gat get gag get etc aac get ttg etc cag aac cag gca get gag 6191 
Gin Asp Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu 
2050 2055 2060 

ttg ate etc acc aac ttg tec ate caa gac aag acc att gag gag ctt 6239 
Leu lie Leu Thr Asn Leu Ser He Gin Asp Lys Thr He Glu Glu Leu 
2065 2070 2075 

gat gct gag aag aca gtc ctt gag aag agc aag gct ggt gcc caa tct 6287
Asp Ala Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser
2080 2085 2090 2095

cgc ttc gac tec tac ggc aag etc tac gat gag aac ate aac get ggt 6335 
Arg Phe Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly 
2100 2105 2110 

gag aac cag gee atg acc etc agg get tec gca get ggt etc acc act 6383 
Glu Asn Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr 
2115 2120 2125 

get gtc caa gee tct cgc ttg get ggt gca get get gac etc gtt cca 6431 
Ala Val Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro 
2130 2135 2140 

aac ate ttc ggt ttc get ggt ggt ggc tec aga tgg ggt gee att get 6479 
Asn He Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala 
2145 2150 2155 

gag get acc ggt tac gtc atg gag ttc tct gee aac gtc atg aac act 6527 
Glu Ala Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr 
2160 2165 2170 2175 

gag get gac aag ate age caa tct gag acc tac aga agg cgc cgt caa 6575 
Glu Ala Asp Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin 
2180 2185 2190 

gag tgg gag ate caa agg aac aac get gag gca gag ttg aag caa ate 6623 
Glu Trp Glu lie Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He 
2195 2200 2205 

gat gct caa ctc aag tcc ttg gct gtc aga agg gag gct gct gtc ctc 6671
Asp Ala Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu
2210 2215 2220

cag aag acc tec etc aag acc caa cag gag caa acc cag tec cag ttg 6719 
Gin Lys Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu 
2225 2230 2235 

get ttc etc caa agg aag ttc tec aac cag get etc tac aac tgg etc 6767 
Ala Phe Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu 
2240 2245 2250 2255 

aga ggc cgc ttg gct gcc atc tac ttc caa ttc tac gac ctt gct gtg 6815
Arg Gly Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val
2260 2265 2270

gcc agg tgc ctc atg gct gag caa gcc tac cgc tgg gag ttg aac gat 6863
Ala Arg Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp
2275 2280 2285 

gac tec gcc agg ttc ate aag cca ggt get tgg caa ggc acc tac get 6911 
Asp Ser Ala Arg Phe lie Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala 
2290 2295 2300 

ggt etc ctt get ggt gag acc etc atg etc tec ttg get caa atg gag 6959 
Gly Leu Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu 
2305 2310 2315 

gat get cac etc aag agg gac aag agg get ttg gag gtg gag agg aca 7007 
Asp Ala His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr 
2320 2325 2330 2335 

gtc tec ctt get gag gtc tac get ggt etc cca aag gac aac ggt cca 7055 
Val Ser Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro 
2340 2345 2350 

ttc tec ctt get caa gag att gac aag ttg gtc age caa ggt tct ggt 7103 
Phe Ser Leu Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly 
2355 2360 2365 

tct get ggt tct ggt aac aac aac ttg get ttc ggc get ggt act gac 7151 
Ser Ala Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp 
2370 2375 2380 

acc aag acc tec etc caa gcc tct gtc tec ttc get gac etc aag ate 7199 
Thr Lys Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He 
2385 2390 2395 

agg gag gac tac cca get tec ctt ggc aag ate agg cgc ate aag caa 7247 
Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin 
2400 2405 2410 2415 

ate tct gtc acc etc cca get etc ttg ggt cca tac caa gat gtc caa 7295 
He Ser Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin 
2420 2425 2430 

gca ate etc tec tac ggt gac aag get ggt ttg gcg aac ggt tgc gag 7343 
Ala He Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu 
2435 2440 2445 

get ctt get gtc tct cat ggc atg aac gac tct ggt caa ttc caa ctt 7391 
Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu 
2450 2455 2460 

gac ttc aac gat ggc aag ttc etc cca ttc gag ggc att gcc att gac 7439 
Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp 
2465 2470 2475 

caa ggc acc ctc acc ctc tcc ttc cca aac gct tcc atg cca gag aag 7487
Gln Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys
2480 2485 2490 2495

gga aag caa gcc acc atg ctc aag acc ctc aac gat atc atc ctc cac 7535
Gly Lys Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His
2500 2505 2510

atc agg tac acc atc aag tgagctcgag aggcctgcgg ccgc 7577
Ile Arg Tyr Thr Ile Lys
2515



<210> 4 

<211> 7541 

<212> DNA 

<213> Artificial Sequence 

<220> 

<221> CDS 

<222> (3)..(7517)

<220> 

<223> Description of Artificial Sequence: hemicot tcbA
<400> 4 

cc atg gct cag aac tcc ctc agc tcc acc att gac acc atc tgc cag 47
Met Ala Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln
1 5 10 15

aag ctt caa etc acc tgc cca get gag ate gec etc tac cca ttc gac 95 
Lys Leu Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp 
20 25 30 

acc ttc cgt gag aag acc aga ggc atg gtc aac tgg ggt gag gee aag 143 
Thr Phe Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys 
35 40 45 

agg ate tac gag att get caa get gag caa gac agg aac etc ctt cat 191 
Arg He Tyr Glu lie Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His 
50 55 60 

gag aag agg ate ttc gee tac get aac cca ttg etc aag aac get gtc 239 
Glu Lys Arg lie Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val 
65 70 75 

agg ctt ggt acc agg caa atg ttg ggt ttc ate caa ggt tac tct gac 287 
Arg Leu Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp 
80 85 90 95 

ttg ttc ggc aac agg get gac aac tac gca get cct ggt tct gtt get 335 
Leu Phe Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala 
100 105 110 

age atg ttc age cca get gee tac etc act gag ttg tac cgt gag gec 383 
Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala 
115 120 125 

aag aac etc cat gac age tec age ate tac tac ctt gac aag agg cgc 431 
Lys Asn Leu His Asp Ser Ser Ser lie Tyr Tyr Leu Asp Lys Arg Arg 
130 135 140 

cca gac ctt get tec ttg atg etc tec cag aag aac atg gat gag gag 479 
Pro Asp Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu 
145 150 155 

atc agc acc ttg gct ctc tcc aac gag ctt tgc ttg gct ggc att gag 527
Ile Ser Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu
160 165 170 175

acc aag act ggc aag tcc caa gat gag gtc atg gac atg ctc tcc acc 575

Thr Lys Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr
180 185 190

tac cgc etc tct ggt gag act cca tac cac cat get tac gag act gtc 623 

Tyr Arg Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val 
195 200 205 

agg gag att gtc cat gag agg gac cca ggt ttc cgc cac ctc tcc caa 671

Arg Glu Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln
210 215 220 

get ccc att gtg get gec aag ttg gac cca gtc acc etc ttg ggc ate 719 

Ala Pro lie Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly lie 
225 230 235 

tec age cac ate age cca gag ttg tac aac ctt etc att gag gag ate 767 

Ser Ser His lie Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He 

240 245 250 255 

cca gag aag gat gag gca get ttg gac acc etc tac aag acc aac ttc 815 

Pro Glu Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe 
260 265 270 

ggt gac ate acc act get caa etc atg age cca tec tac ttg gee agg 863 

Gly Asp He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg 
275 280 285 

tac tac ggt gtc tct cca gag gac att get tac gtc acc aca age etc 911 

Tyr Tyr Gly Val Ser Pro Glu Asp lie Ala Tyr Val Thr Thr Ser Leu 
290 295 300 

tec cat gtg ggt tac tec tct gac ate ctt gtc ate cca etc gtg gat 959 

Ser His Val Gly Tyr Ser Ser Asp He Leu Val He Pro Leu Val Asp 
305 310 315 

ggt gtg ggc aag atg gag gtt gtc agg gtc acc agg act cca tct gac 1007 

Gly Val Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp 

320 325 330 335 

aac tac acc tec cag acc aac tac att gag ttg tac cca caa ggt ggt 1055 

Asn Tyr Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly 
340 345 350 

gac aac tac etc ate aag tac aac etc tec aac tct ttc ggt ttg gat 1103 

Asp Asn Tyr Leu He Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp 
355 360 365 

gac ttc tac etc cag tac aag gat ggt tct get gac tgg act gag att 1151 

Asp Phe Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He 
370 375 380 

get cac aac cca tac cca gac atg gtc ate aac cag aag tac gag tec 1199 

Ala His Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser 
385 390 395 

caa gcc acc atc aag aga tct gac tct gac aac atc ctc tcc att ggt 1247

Gln Ala Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly

400 405 410 415 

ctc caa agg tgg cac tct ggt tcc tac aac ttc gct gct gcc aac ttc 1295
Leu Gln Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe
420 425 430

aag att gac caa tac tct cca aag get ttc etc ttg aag atg aac aag 1343 
Lys lie Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys 
435 440 445 

gec ate agg etc ttg aag gee act ggt etc tec ttc gec ace ctt gag 1391 
Ala lie Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu 
450 455 460 

agg att gtg gac tct gtc aac tec acc aag tec ate act gtg gag gtc 1439 
Arg lie Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val 
465 470 475 

etc aac aag gtc tac aga gtc aag ttc tac att gac cgc tac ggc ate 1487 
Leu Asn Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg Tyr Gly He 
480 485 490 495 

tct gag gag act get gec ate ctt gee aac ate aac ate tec cag caa 1535 
Ser Glu Glu Thr Ala Ala He Leu Ala Asn lie Asn He Ser Gin Gin 
500 505 510 

get gtc ggc aac cag etc tec caa ttc gag caa etc ttc aac cac cct 1583 
Ala Val Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro 
515 520 525 

cca etc aac ggc ate cgc tac gag ate age gag gac aac tec aag cac 1631 
Pro Leu Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His 
530 535 540 

etc cca aac cca gac etc aac etc aag cca gac tec act ggt gat gac 1679 
Leu Pro Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp 
545 550 555 

caa agg aag get gtc etc aag agg get ttc caa gtc aac get tct gag 1727 
Gin Arg Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu 
560 565 570 575 

ctt tac caa atg ctc ttg atc act gac agg aag gag gat ggt gtc atc 1775
Leu Tyr Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile
580 585 590

aag aac aac ttg gag aac etc tct gac etc tac ctt gtc tec etc ttg 1823 
Lys Asn Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu 
595 600 605 

gee caa ate cac aac ttg acc att get gag ttg aac ate etc ttg gtc 1871 
Ala Gin He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val 
610 615 620 

ate tgc ggt tac ggt gac acc aac ate tac caa ate act gac gac aac 1919 
He Cys Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn 
625 630 635 

ctt gee aag att gtg gag acc etc ttg tgg ate acc caa tgg etc aag 1967 
Leu Ala Lys He Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys 
640 645 650 655 

acc cag aag tgg act gtc aca gac ctc ttc ctc atg acc act gcc acc 2015
Thr Gln Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr
660 665 670

tac tec acc act etc act cca gag att tec aac etc act gec acc etc 2063 
Tyr Ser Thr Thr Leu Thr Pro Glu lie Ser Asn Leu Thr Ala Thr Leu 
675 680 685 

age tec acc etc cac ggc aag gag tec etc att ggt gag gac etc aag 2111 
Ser Ser Thr Leu His Gly Lys Glu Ser Leu lie Gly Glu Asp Leu Lys 
690 695 700 

agg gca atg get cca tgc ttc acc tct get etc cac etc acc tec caa 2159 
Arg Ala Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin 
705 710 715 

gag gtg get tac gac etc ctt etc tgg att gac caa ate caa cca get 2207 
Glu Val Ala Tyr Asp Leu Leu Leu Trp lie Asp Gin He Gin Pro Ala 
720 725 730 735 

caa ate act gtg gat ggt ttc tgg gag gag gtc caa acc act cca acc 2255 
Gin He Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr 
740 745 750 

tec etc aag gtc ate acc ttc get caa gtc ttg get caa etc tec etc 2303 
Ser Leu Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu 
755 760 765 

ate tac aga agg att ggt etc tct gag act gag ttg tec etc att gtc 2351 
He Tyr Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val 
770 775 780 

acc caa tec age etc ttg gtc get ggc aag tec ate ctt gat cat ggt 2399 
Thr Gin Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly 
785 790 795 

etc ttg acc etc atg get ctt gag ggt ttc cac acc tgg gtc aac ggt 2447 
Leu Leu Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly 
800 805 810 815 

ttg ggt caa cat get tec etc ate ttg get gca etc aag gat ggt get 2495 
Leu Gly Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala 
820 825 830 

ctc acc gtc acc gat gtg gct caa gcc atg aac aag gag gag tcc ctc 2543
Leu Thr Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu
835 840 845

ttg caa atg get gee aac cag gtg gag aag gac etc acc aag etc acc 2591 
Leu Gin Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr 
850 855 860 

tec tgg acc caa ate gat gee ate etc caa tgg etc caa atg tec tct 2639 
Ser Trp Thr Gin lie Asp Ala He Leu Gin Trp Leu Gin Met Ser Ser 
865 870 875 

get ctt get gtc age cca ttg gac ctt get ggc atg atg get etc aag 2687 
Ala Leu Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys 
880 885 890 895 

tac ggc att gat cac aac tac gct gcc tgg caa gca gct gcc gct gcc 2735
Tyr Gly Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala
900 905 910

ctc atg gct gac cat gcc aac cag gct cag aag aag ttg gat gag acc 2783

Leu Met Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr

915 920 925 

ttc tec aag get etc tgc aac tac tac ate aac gee gtg gtt gac tct 2831 

Phe Ser Lys Ala Leu Cys Asn Tyr Tyr He Asn Ala Val Val Asp Ser 

930 935 940 

get gec ggt gtc agg gac agg aac ggt etc tac acc tac etc ttg att 2879 

Ala Ala Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu He 

945 950 955 

gac aac cag gtc tct get gat gtc ate acc tec aga att get gag gee 2927 

Asp Asn Gin Val Ser Ala Asp Val He Thr Ser Arg He Ala Glu Ala 

960 965 970 975 

att get ggc ate caa etc tac gtc aac agg get etc aac agg gat gag 2975 

He Ala Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu 
980 985 990 

ggt cag ttg get tct gat gtc tec acc agg caa ttc ttc acc gac tgg 3023 

Gly Gin Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp 

995 1000 1005 

gag agg tac aac aag agg tac tec acc tgg get ggt gtc tct gag ttg 3071 

Glu Arg Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu 

1010 1015 1020 

gtc tac tac cca gag aac tac gtg gac cca acc caa agg att ggt cag 3119 

Val Tyr Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin 

1025 1030 1035 

acc aag atg atg gat get ttg etc caa tec ate aac cag tec caa etc 3167 

Thr Lys Met Met Asp Ala Leu Leu Gin Ser He Asn Gin Ser Gin Leu 

1040 1045 1050 1055 

aac get gac act gtg gag gat get ttc aag acc tac etc acc tec ttc 3215 

Asn Ala Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe 
1060 1065 1070 

gag caa gtg gee aac etc aag gtc ate tct get tac cat gac aac gtc 3263 

Glu Gin Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val 

1075 1080 1085 

aac gtg gac caa ggt etc acc tac ttc att ggc att gac caa gec get 3311 

Asn Val Asp Gin Gly Leu Thr Tyr Phe He Gly lie Asp Gin Ala Ala 

1090 1095 1100 

cct ggc acc tac tac tgg agg tct gtg gac cac tec aag tgc gag aac 3359 

Pro Gly Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn 

1105 1110 1115 

ggc aag ttc get gec aac get tgg ggt gag tgg aac aag ate acc tgc 3407 

Gly Lys Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys He Thr Cys 

1120 1125 1130 1135 

gct gtc aac cct tgg aag aac atc atc agg cca gtg gtc tac atg tcc 3455

Ala Val Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser
1140 1145 1150

aga ctc tac ttg ctc tgg ctt gag caa cag tcc aag aag tct gat gac 3503
Arg Leu Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp
1155 1160 1165

ggc aag aca act ate tac cag tac aac etc aag ttg get cac ate cgc 3551 
Gly Lys Thr Thr He Tyr Gin Tyr Asn Leu Lys Leu Ala His He Arg 
1170 1175 1180 

tac gat ggt tec tgg aac act cca ttc acc ttc gat gtc act gag aag 3599 
Tyr Asp Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys 
1185 1190 1195 

gtc aag aac tac acc tec age act gat gca get gag tec ctt ggt etc 3647 
Val Lys Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu 
1200 1205 1210 1215 

tac tgc act ggt tac caa ggt gag gac acc etc ttg gtc atg ttc tac 3695 
Tyr Cys Thr Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met Phe Tyr 
1220 1225 1230 

tec atg caa tec age tac tec age tac act gac aac aac get cca gtc 3743 
Ser Met Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val 
1235 1240 1245 

act ggt etc tac ate ttc get gac atg tec tct gac aac atg acc aac 3791 
Thr Gly Leu Tyr He Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn 
1250 1255 1260 

get caa gee acc aac tac tgg aac aac tec tac cca caa ttc gac act 3839 
Ala Gin Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr 
1265 1270 1275 

gtc atg get gac cca gac tct gac aac aag aag gtc ate acc agg cgt 3887 
Val Met Ala Asp Pro Asp Ser Asp Asn Lys Lys Val He Thr Arg Arg 
1280 1285 1290 1295 

gtc aac aac cgc tac get gag gac tac gag ate cca age tct gtc acc 3935 
Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Thr 
1300 1305 1310 

tec aac age aac tac tec tgg ggt gac cac tec etc acc atg etc tac 3983 
Ser Asn Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr 
1315 1320 1325 

ggt ggc tct gtc cca aac ate acc ttc gag tct gca get gag gac etc 4031 
Gly Gly Ser Val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu 
1330 1335 1340 

agg etc tec acc aac atg get etc tec ate att cac aac ggt tac get 4079 
Arg Leu Ser Thr Asn Met Ala Leu Ser lie He His Asn Gly Tyr Ala 
1345 1350 1355 

ggc acc agg cgc ate caa tgc aac etc atg aag caa tac get tec ctt 4127 
Gly Thr Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu 
1360 1365 1370 1375 

ggt gac aag ttc att ate tac gac tec age ttc gat gac gee aac agg 4175 
Gly Asp Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg 
1380 1385 1390 

ttc aac ttg gtc cca ctc ttc aag ttc ggc aag gat gag aac tct gat 4223
Phe Asn Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp
1395 1400 1405

gac tcc atc tgc atc tac aac gag aac cca agc tct gag gac aag aag 4271
Asp Ser Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys
1410 1415 1420

tgg tac ttc age tec aag gac gac aac aag act get gac tac aac ggt 4319 
Trp Tyr Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly 
1425 1430 1435 

ggc acc caa tgc att gat get ggc acc tec aac aag gac ttc tac tac 4367 
Gly Thr Gin Cys He Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr 
1440 1445 1450 1455 

aac etc caa gag att gag gtc ate tct gtc act ggt ggc tac tgg tec 4415 
Asn Leu Gin Glu He Glu Val He Ser Val Thr Gly Gly Tyr Trp Ser 
1460 1465 1470 

age tac aag ate age aac ccc ate aac ate aac act ggc att gac tct 4463 
Ser Tyr Lys He Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser 
1475 1480 1485 

gee aag gtc aag gtc act gtc aag get ggt ggc gat gac caa ate ttc 4511 
Ala Lys Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe 
1490 1495 1500 

act get gac aac tec acc tac gtc cca cag caa cct get cca tec ttc 4559 
Thr Ala Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe 
1505 1510 1515 

gag gag atg ate tac caa ttc aac aac etc acc att gac tgc aag aac 4607 
Glu Glu Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn 
1520 1525 1530 1535 

etc aac ttc att gac aac cag get cac att gag att gac ttc act gee 4655 
Leu Asn Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala 
1540 1545 1550 

aca get caa gat ggc cgc ttc ttg ggt get gag acc ttc ate att cca 4703 
Thr Ala Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe lie He Pro 
1555 1560 1565 

gtc acc aag aag gtc ctt ggc act gag aac gtc att get etc tac tct 4751 
Val Thr Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser 
1570 1575 1580 

gag aac aac ggt gtc cag tac atg caa att ggt gct tac aga acc agg 4799
Glu Asn Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg
1585 1590 1595 

etc aac acc etc ttc get caa cag ttg gtc tec cgt gee aac aga ggc 4847 
Leu Asn Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly 
1600 1605 1610 1615 

att gat get gtc etc age atg gag act cag aac ate caa gag cca caa 4895 
He Asp Ala Val Leu Ser Met Glu Thr Gin Asn lie Gin Glu Pro Gin 
1620 1625 1630 

ctt ggt gct ggc acc tac gtc caa ctt gtc ttg gac aag tac gat gag 4943
Leu Gly Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu
1635 1640 1645

tec att cat ggc acc aac aag tec ttc gec att gag tac gtg gac ate 4991 
Ser He His Gly Thr Asn Lys Ser Phe Ala He Glu Tyr Val Asp He 
1650 1655 1660 

ttc aag gag aac gac tec ttc gtc ate tac caa ggt gag ttg tct gag 5039 
Phe Lys Glu Asn Asp Ser Phe Val He Tyr Gin Gly Glu Leu Ser Glu 
1665 1670 1675 

acc tec caa act gtg gtc aag gtc ttc etc tec tac ttc att gag gee 5087 
Thr Ser Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe He Glu Ala 
1680 1685 1690 1695 

acc ggt aac aag aac cac etc tgg gtc agg gec aag tac cag aag gag 5135 
Thr Gly Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu 
1700 1705 1710 

acc act gac aag ate etc ttc gac agg act gat gag aag gac cca cat 5183 
Thr Thr Asp Lys He Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His 
1715 1720 1725 

ggt tgg ttc etc tct gat gac cac aag acc ttc tct ggt etc age tct 5231 
Gly Trp Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser 
1730 1735 1740 

get caa get etc aag aac gac tct gag cca atg gac ttc tct ggt gee 5279 
Ala Gin Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala 
1745 1750 1755 

aac get etc tac ttc tgg gag ttg ttc tac tac act cca atg atg atg 5327 
Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met 
1760 1765 1770 1775 

get cac agg etc ctt caa gag cag aac ttc gat get gee aac cac tgg 5375 
Ala His Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp 
1780 1785 1790 

ttc cgc tac gtc tgg age cca tct ggt tac att gtg gat ggc aag att 5423 
Phe Arg Tyr Val Trp Ser Pro Ser Gly Tyr He Val Asp Gly Lys He 
1795 1800 1805 

gee ate tac cac tgg aac gtc agg cca ttg gag gag gac acc tec tgg 5471 
Ala He Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp 
1810 1815 1820 

aac get cag caa ctt gac tec act gac cca gat get gtg get caa gat 5519 
Asn Ala Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp 
1825 1830 1835 

gac cca atg cac tac aag gtg gee acc ttc atg gec acc ttg gac ctt 5567 
Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu 
1840 1845 1850 1855 

etc atg gec aga ggt gat get gee tac cgc caa ttg gag agg gac acc 5615 
Leu Met Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr 
1860 1865 1870 

ttg gct gag gcc aag atg tgg tac acc caa gct ctc aac ttg ctg ggt 5663
Leu Ala Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly
1875 1880 1885

gat gag cca caa gtc atg ctc tcc aca acc tgg gcc aac cca acc ttg 5711
Asp Glu Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu
1890 1895 1900 

ggc aac get gec tec aag acc aca caa cag gtc agg caa cag gtc etc 5759 
Gly Asn Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu 
1905 1910 1915 

acc caa etc agg etc aac tct aga gtc aag act cca etc ttg ggc act 5807 
Thr Gin Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr 
1920 1925 1930 1935 

gec aac tec etc act get etc ttc etc cca caa gag aac tec aaa ctt 5855 
Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu 
1940 1945 1950 

aag ggt tac tgg agg acc ctt get caa cgc atg ttc aac etc agg cac 5903 
Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His 
1955 1960 1965 

aac etc tec att gat ggt caa cca etc tec ttg cca etc tac get aag 5951 
Asn Leu Ser lie Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys 
1970 1975 1980 

cca get gac cca aag get etc ctt tec get get gtc tec gca tec caa 5999 
Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin 
1985 1990 1995 

ggt ggt get gac etc cca aag get cca etc acc ate cac agg ttc cca 6047 
Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr lie His Arg Phe Pro 
2000 2005 2010 2015 

caa atg ttg gag ggt gec cgt ggt ctt gtc aac cag etc ate caa ttc 6095 
Gin Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe 
2020 2025 2030 

ggt tec tct etc ctt ggt tac tct gag agg caa gat get gag gec atg 6143 
Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met 
2035 2040 2045 

tec caa etc ttg caa acc cag get tct gag ttg ate etc acc tec ate 6191 
Ser Gin Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He 
2050 2055 2060 

agg atg caa gac aac cag ctt get gag ttg gac tct gag aag act get 6239 
Arg Met Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala 
2065 2070 2075 

etc caa gtc tec ctt get ggt gtc caa cag agg ttc gac age tac tec 6287 
Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser 
2080 2085 2090 2095 

caa etc tac gag gag aac ate aac get ggt gag caa agg get ttg get 6335 
Gin Leu Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala 
2100 2105 2110 

ctc agg tct gag tct gcc att gag tcc caa ggt gct caa atc tcc cgc 6383
Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg
2115 2120 2125

atg gct ggt gct ggc gtg gac atg gct cca aac atc ttc ggt ctt gct 6431
Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala
2130 2135 2140 

gat ggt ggc atg cac tac ggt gee att get tac gee att get gat ggc 6479 
Asp Gly Gly Met His Tyr Gly Ala lie Ala Tyr Ala He Ala Asp Gly 
2145 2150 2155 

att gag ctt tct get tct gec aag atg gtt gat get gag aag gtg get 6527 
He Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala 
2160 2165 2170 2175 

caa tct gaa atc tac cgt cgc aga cgc caa gaa tgg aag atc caa agg 6575
Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg
2180 2185 2190 

gac aac get caa get gag ate aac cag etc aac get caa ctt gag tec 6623 
Asp Asn Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser 
2195 2200 2205 

etc age ate agg cgt gag get get gag atg cag aag gag tac etc aag 6671 
Leu Ser He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys 
2210 2215 2220 

acc caa cag get caa get cag get caa etc ace ttc etc agg tec aag 6719 
Thr Gin Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys 
2225 2230 2235 

ttc tec aac cag get etc tac tec tgg etc aga ggc cgc etc tct ggc 6767 
Phe Ser Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly 
2240 2245 2250 2255 

ate tac ttc caa ttc tac gac ttg get gtc tec cgc tgc etc atg get 6815 
lie Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala 
2260 2265 2270 

gag caa tec tac caa tgg gag gee aac gac aac age ate tec ttc gtc 6863 
Glu Gin Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val 
2275 2280 2285 

aag cca ggt get tgg caa ggc acc tac get ggt etc ctt tgc ggt gag 6911 
Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu 
2290 2295 2300 

get etc ate cag aac ttg get caa atg gag gag get tac etc aag tgg 6959 
Ala Leu He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp 
2305 2310 2315 

gag tec aga get ttg gag gta gag agg act gtc tec ctt get gta gtc 7007 
Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val 
2320 2325 2330 2335 

tac gac tec ttg gag ggc aac gac agg ttc aac ctt get gag caa ate 7055 
Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin He 
2340 2345 2350 

cca get etc ttg gac aag ggt gag ggc act get ggc acc aag gag aac 7103 
Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn 
2355 2360 2365 

ggt ctc tcc ttg gcc aac gcc atc ctc tct gct tct gtc aag ctc tct 7151
Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser
2370 2375 2380 

gac etc aag ttg ggt act gac tac cca gac tec att gtg ggt tec aac 7199 
Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser lie Val Gly Ser Asn 
2385 2390 2395 

aag gtc aga agg ate aag caa ate tct gtc tec etc cca get ttg gtg 7247 
Lys Val Arg Arg lie Lys Gin lie Ser Val Ser Leu Pro Ala Leu Val 
2400 2405 2410 2415 

ggt cca tac caa gat gtc caa gec atg etc tec tac ggt ggc tec acc 7295 
Gly Pro Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr 
2420 2425 2430 

caa etc cca aag ggt tgc tct get ttg get gtc tec cac ggc acc aac 7343 
Gin Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn 
2435 2440 2445 

gac tct ggt caa ttc caa ctt gac ttc aac gat ggc aag tac etc cca 7391 
Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro 
2450 2455 2460 

ttc gaa ggc att get ttg gat gac caa ggc acc etc aac etc caa ttc 7439 
Phe Glu Gly lie Ala Leu Asp Asp Gin Gly Thr Leu Asn Leu Gin Phe 
2465 2470 2475 

cca aac gee act gac aag cag aag gee ate etc caa acc atg tct gac 7487 
Pro Asn Ala Thr Asp Lys Gin Lys Ala lie Leu Gin Thr Met Ser Asp 
2480 2485 2490 2495 

atc atc ctc cac atc agg tac acc atc agg tgagctcgag aggcctgcgg 7537
Ile Ile Leu His Ile Arg Tyr Thr Ile Arg
2500 2505

ccgc 7541



<210> 5 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : hemicot sequence 
encoding ER signal from 15 kDa zein from Black 
Mexican Sweet maize 

<220> 
<221> CDS 
<222> (1)..(63)

<400> 5 

atg gct aag atg gtc att gtg ctt gtg gtc tgc ttg gct ctc tct gct 48
Met Ala Lys Met Val Ile Val Leu Val Val Cys Leu Ala Leu Ser Ala
1 5 10 15

gcc tgt gct tca gcc 63
Ala Cys Ala Ser Ala
20

<210> 6
<211> 7621 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : hemicot tcdA 
fused to the modified 15 kDa zein endoplasmic 
reticulum signal peptide 

<220> 

<221> CDS 

<222> (4)..(7614)

<400> 6 

ncc atg gct aag atg gtc att gtg ctt gtg gtc tgc ttg gct ctc tct 48
Met Ala Lys Met Val Ile Val Leu Val Val Cys Leu Ala Leu Ser
1 5 10 15

gct gcc tgt gct tca gcc atg aac gag tcc gtc aag gag atc cca gac 96
Ala Ala Cys Ala Ser Ala Met Asn Glu Ser Val Lys Glu Ile Pro Asp
20 25 30

gtc etc aag tec caa tgc ggt ttc aac tgc etc act gac ate tec cac 144 
Val Leu Lys Ser Gin Cys Gly Phe Asn Cys Leu Thr Asp lie Ser His 
35 40 45 

age tec ttc aac gag ttc aga caa caa gtc tct gag cac etc tec tgg 192 
Ser Ser Phe Asn Glu Phe Arg Gin Gin Val Ser Glu His Leu Ser Trp 
50 55 60 

tec gag ace cat gac etc tac cat gac get cag caa get cag aag gac 240 
Ser Glu Thr His Asp Leu Tyr His Asp Ala Gin Gin Ala Gin Lys Asp 
65 70 75 

aac agg etc tac gag get agg ate etc aag agg get aac cca caa etc 288 
Asn Arg Leu Tyr Glu Ala Arg lie Leu Lys Arg Ala Asn Pro Gin Leu 
80 85 90 95 

cag aac gct gtc cac ctc gcc atc ttg gct cca aac gct gag ttg att 336
Gln Asn Ala Val His Leu Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile
100 105 110

ggt tac aac aac cag ttc tct ggc aga get age cag tac gtg get cct 384 
Gly Tyr Asn Asn Gin Phe Ser Gly Arg Ala Ser Gin Tyr Val Ala Pro 
115 120 125 

ggt aca gtc tec tec atg ttc age cca gee get tac etc act gag ttg 432 
Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu 
130 135 140 

tac cgc gag get agg aac ctt cat get tct gac tec gtc tac tac ttg 480 
Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr Tyr Leu 
145 150 155 

gac aca cgc aga cca gac etc aag age atg gec etc age caa cag aac 528 
Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gin Gin Asn 
160 165 170 175 

atg gac att gag ttg tcc acc ctc tcc ttg agc aac gag ctt ctc ttg 576 
Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu 
180 185 190 

gag tec ate aag act gag age aag ttg gag aac tac acc aag gtc atg 624 

Glu Ser lie Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys Val Met 

195 200 205 

gag atg etc tec acc ttc aga cca age ggt gca act cca tac cat gat 672 

Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr His Asp 

210 215 220 

gee tac gag aac gtc agg gag gtc ate caa ctt caa gac cct ggt ctt 720 

Ala Tyr Glu Asn Val Arg Glu Val lie Gin Leu Gin Asp Pro Gly Leu 

225 230 235 

gag caa etc aac get tct cca gec att get ggt ttg atg cac cag gca 768 

Glu Gin Leu Asn Ala Ser Pro Ala lie Ala Gly Leu Met His Gin Ala 

240 245 250 255 

tec ttg etc ggt ate aac gec tec ate tct cct gag ttg ttc aac ate 816 

Ser Leu Leu Gly lie Asn Ala Ser lie Ser Pro Glu Leu Phe Asn lie 
260 265 270 

ttg act gag gag ate act gag ggc aac get gag gag ttg tac aag aag 864 

Leu Thr Glu Glu lie Thr Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys 

275 280 285 

aac ttc ggc aac att gag cca gec tct ctt gca atg cct gag tac etc 912 

Asn Phe Gly Asn lie Glu Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu 

290 295 300 

aag agg tac tac aac ttg tct gat gag gag ctt tct caa ttc att ggc 960 

Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gin Phe lie Gly 

305 310 315 

aag get tec aac ttc ggt caa cag gag tac age aac aac cag etc ate 1008 

Lys Ala Ser Asn Phe Gly Gin Gin Glu Tyr Ser Asn Asn Gin Leu He 

320 325 330 335 

act cca gtt gtg aac tec tct gat ggc act gtg aag gtc tac cgc ate 1056 

Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr Arg He 
340 345 350 

aca cgt gag tac acc aca aac gec tac caa atg gat gtt gag ttg ttc 1104 

Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gin Met Asp Val Glu Leu Phe 

355 360 365 

cca ttc ggt ggt gag aac tac aga ctt gac tac aag ttc aag aac ttc 1152 

Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe 

370 375 380 

tac aac gee tec tac etc tec ate aag ttg aac gac aag agg gag ctt 1200 

Tyr Asn Ala Ser Tyr Leu Ser lie Lys Leu Asn Asp Lys Arg Glu Leu 

385 390 395 

gtc agg act gag ggt get cct caa gtg aac att gag tac tct gee aac 1248 

Val Arg Thr Glu Gly Ala Pro Gin Val Asn He Glu Tyr Ser Ala Asn 

400 405 410 415 

atc acc ctc aac aca gct gac atc tct caa cca ttc gag att ggt ttg 1296 
Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu 
420 425 430 

acc aga gtc ctt ccc tct ggc tec tgg gec tac get gca gee aag ttc 1344 
Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe 
435 440 445 

act gtt gag gag tac aac cag tac tct ttc etc ttg aag etc aac aag 1392 
Thr Val Glu Glu Tyr Asn Gin Tyr Ser Phe Leu Leu Lys Leu Asn Lys 
450 455 460 

gca att cgt ctc agc aga gcc act gag ttg tct ccc acc atc ttg gag 1440 
Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu 
465 470 475 

ggc att gtg agg tct gtc aac ctt caa ctt gac ate aac act gat gtg 1488 
Gly He Val Arg Ser Val Asn Leu Gin Leu Asp He Asn Thr Asp Val 
480 485 490 495 

ctt ggc aag gtc ttc etc acc aag tac tac atg caa cgc tac gec ate 1536 
Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gin Arg Tyr Ala He 
500 505 510 

cat get gag act gca etc ate etc tgc aac gca ccc ate tct caa cgc 1584 
His Ala Glu Thr Ala Leu He Leu Cys Asn Ala Pro He Ser Gin Arg 
515 520 525 

tec tac gac aac cag cct tec cag ttc gac agg etc ttc aac act cct 1632 
Ser Tyr Asp Asn Gin Pro Ser Gin Phe Asp Arg Leu Phe Asn Thr Pro 
530 535 540 

ctc ttg aac ggc cag tac ttc tcc act ggt gat gag gag att gac ctc 1680 
Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu 
545 550 555 

aac tct ggc tec aca ggt gac tgg aga aag acc ate ttg aag agg gec 1728 
Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr He Leu Lys Arg Ala 
560 565 570 575 

ttc aac att gat gat gtc tct etc ttc cgt etc ttg aag ate aca gat 1776 
Phe Asn He Asp Asp Val Ser Leu Phe Arg Leu Leu Lys lie Thr Asp 
580 585 590 

cac gac aac aag gat ggc aag ate aag aac aac ttg aag aac ctt tec 1824 
His Asp Asn Lys Asp Gly Lys lie Lys Asn Asn Leu Lys Asn Leu Ser 
595 600 605 

aac etc tac att ggc aag ttg ctt gca gac ate cac caa etc acc att 1872 
Asn Leu Tyr lie Gly Lys Leu Leu Ala Asp lie His Gin Leu Thr lie 
610 615 620 

gat gag ttg gac etc ttg etc att gca gtc ggt gag ggc aag acc aac 1920 
Asp Glu Leu Asp Leu Leu Leu lie Ala Val Gly Glu Gly Lys Thr Asn 
625 630 635 

etc tct gca ate tct gac aag cag ttg gca acc etc ate agg aag ttg 1968 
Leu Ser Ala lie Ser Asp Lys Gin Leu Ala Thr Leu lie Arg Lys Leu 
640 645 650 655 

aac acc ate acc tec tgg ctt cac acc cag aag tgg tct gtc ttc caa 2016 
Asn Thr He Thr Ser Trp Leu His Thr Gin Lys Trp Ser Val Phe Gin 
660 665 670 
ctc ttc atc atg acc agc acc tcc tac aac aag acc ctc act cct gag 2064 
Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu 
675 680 685 

ate aag aac etc ttg gac aca gtc tac cac ggt etc caa ggc ttc gac 2112 

He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly Phe Asp 
690 695 700 

aag gac aag get gac ttg ctt cat gtc atg get ccc tac att gca gee 2160 

Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr He Ala Ala 
705 710 715 

ace etc caa etc tec tct gag aac gtg get cac tct gtc ttg etc tgg 2208 

Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu Trp 

720 725 730 735 

get gac aag etc caa cct ggt gat ggt gec atg act get gag aag ttc 2256 

Ala Asp Lys Leu Gin Pro Gly Asp Gly Ala Met Thr Ala Glu Lys Phe 
740 745 750 

tgg gac tgg etc aac ace aag tac aca cca ggc tec tct gag get gtt 2304 

Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val 
755 760 765 

gag act caa gag cac att gtg caa tac tgc cag get ctt gca cag ttg 2352 

Glu Thr Gin Glu His He Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu 
770 775 780 

gag atg gtc tac cac tec act ggc ate aac gag aac get ttc aga etc 2400 

Glu Met Val Tyr His Ser Thr Gly He Asn Glu Asn Ala Phe Arg Leu 
785 790 795 

ttc gtc acc aag cct gag atg ttc ggt get gee aca ggt get gca cct 2448 

Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala Pro 

800 805 810 815 

gct cat gat gct ctc tcc ctc atc atg ttg acc agg ttc gct gac tgg 2496 
Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp Trp 
820 825 830 

gtc aac get ctt ggt gag aag get tec tct gtc ttg get gee ttc gag 2544 
Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe Glu 
835 840 845 

gee aac tec etc act get gag caa ctt get gat gee atg aac ctt gat 2592 

Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Met Asn Leu Asp 
850 855 860 

gee aac etc ttg etc caa get tec att caa get cag aac cac caa cac 2640 
Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His Gin His 
865 870 875 

etc cca cct gtc act cca gag aac get ttc tec tgc tgg acc tec ate 2688 

Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser He 

880 885 890 895 

aac acc ate etc caa tgg gtc aac gtg get cag caa etc aac gtg get 2736 

Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Asn Val Ala 
900 905 910 
cca caa ggt gtc tct gct ttg gtc ggt ctt gac tac atc cag tcc atg 2784 
Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln Ser Met 
915 920 925 

aag gag aca cca ace tac get caa tgg gag aac gca get ggt gtc ttg 2832 
Lys Glu Thr Pro Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly Val Leu 
930 935 940 

act get ggt etc aac tec caa cag gec aac ace etc cat get ttc ttg 2880 
Thr Ala Gly Leu Asn Ser Gin Gin Ala Asn Thr Leu His Ala Phe Leu 
945 950 955 

gat gag tct cgc tct get gec etc tec acc tac tac ate agg caa gtc 2928 
Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr He Arg Gin Val 
960 965 970 975 

gec aag gca get get gec ate aag tct cgc gat gac etc tac caa tac 2976 
Ala Lys Ala Ala Ala Ala He Lys Ser Arg Asp Asp Leu Tyr Gin Tyr 
980 985 990 

ctc ctc att gac aac cag gtc tct gct gcc atc aag acc acc agg atc 3024 
Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr Arg Ile 
995 1000 1005 

get gag gec ate get tec ate caa etc tac gtc aac cgc get ctt gag 3072 
Ala Glu Ala lie Ala Ser He Gin Leu Tyr Val Asn Arg Ala Leu Glu 
1010 1015 1020 

aac gtt gag gag aac gcc aac tct ggt gtc atc tct cgc caa ttc ttc 3120 
Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln Phe Phe 
1025 1030 1035 

ate gac tgg gac aag tac aac aag agg tac tec acc tgg get ggt gtc 3168 
He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val 
1040 1045 1050 1055 

tct caa ctt gtc tac tac cca gag aac tac att gac cca acc atg agg 3216 
Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr Met Arg 
1060 1065 1070 

att ggt cag acc aag atg atg gat get etc ttg caa tct gtc tec caa 3264 
lie Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val Ser Gin 
1075 1080 1085 

age caa etc aac get gac act gtg gag gat gec ttc atg age tac etc 3312 
Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser Tyr Leu 
1090 1095 1100 

acc tec ttc gag caa gtt gec aac etc aag gtc ate tct get tac cat 3360 
Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He Ser Ala Tyr His 
1105 1110 1115 

gac aac ate aac aac gac caa ggt etc acc tac ttc att ggt etc tct 3408 
Asp Asn lie Asn Asn Asp Gin Gly Leu Thr Tyr Phe lie Gly Leu Ser 
1120 1125 1130 1135 

gag act gat gct ggt gag tac tac tgg aga tcc gtg gac cac agc aag 3456 
Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His Ser Lys 
1140 1145 1150 

ttc aac gat ggc aag ttc gct gca aac gct tgg tct gag tgg cac aag 3504 
Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp His Lys 
1155 1160 1165 

att gac tgc cct atc aac cca tac aag tcc acc atc aga cct gtc atc 3552 
Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile 
1170 1175 1180 

tac aag age cgc etc tac ttg etc tgg ctt gag cag aag gag ate acc 3600 

Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu He Thr 
1185 1190 1195 

aag caa act ggc aac tec aag gat ggt tac caa act gag act gac tac 3648 

Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr 
1200 1205 1210 1215 

cgc tac gag ttg aag ttg get cac ate cgc tac gat ggt acc tgg aac 3696 

Arg Tyr Glu Leu Lys Leu Ala His He Arg Tyr Asp Gly Thr Trp Asn 
1220 1225 1230 

act cca ate acc ttc gat gtc aac aag aag ate age gag ttg aag ttg 3744 

Thr Pro He Thr Phe Asp Val Asn Lys Lys lie Ser Glu Leu Lys Leu 
1235 1240 1245 

gag aag aac cgt get cct ggt etc tac tgc get ggt tac caa ggt gag 3792 

Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu 
1250 1255 1260 

gac acc etc ttg gtc atg ttc tac aac cag caa gac acc ctt gac tec 3840 

Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser 
1265 1270 1275 

tac aag aac get tec atg caa ggt etc tac ate ttc get gac atg get 3888 

Tyr Lys Asn Ala Ser Met Gin Gly Leu Tyr He Phe Ala Asp Met Ala 
1280 1285 1290 1295 

tec aag gac atg act cca gag caa age aac gtc tac cgt gac aac tec 3936 

Ser Lys Asp Met Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser 
1300 1305 1310 

tac caa cag ttc gac acc aac aac gtc agg cgt gtc aac aac aga tac 3984 

Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn Arg Tyr 
1315 1320 1325 

get gag gac tac gag ate cca age tct gtc age tct cgc aag gac tac 4032 

Ala Glu Asp Tyr Glu He Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr 
1330 1335 1340 

ggc tgg ggt gac tac tac etc age atg gtg tac aac ggt gac ate cca 4080 

Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp He Pro 
1345 1350 1355 

acc ate aac tac aag get gee tct tec gac etc aaa ate tac ate age 4128 

Thr He Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys lie Tyr lie Ser 
1360 1365 1370 1375 

cca aag ctc agg atc atc cac aac ggc tac gag ggt cag aag agg aac 4176 
Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys Arg Asn 
1380 1385 1390 

cag tgc aac ttg atg aac aag tac ggc aag ttg ggt gac aag ttc att 4224 
Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile 
1395 1400 1405 

gtc tac acc tct ctt ggt gtc aac cca aac aac age tec aac aag etc 4272 

Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn Lys Leu 
1410 1415 1420 

atg ttc tac cca gtc tac caa tac tct ggc aac acc tct ggt etc aac 4320 

Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn 
1425 1430 1435 

cag ggt aga etc ttg ttc cac agg gac acc acc tac cca age aag gtg 4368 

Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser Lys Val 
1440 1445 1450 1455 

gag get tgg att cct ggt gee aag agg tec etc acc aac cag aac get 4416 

Glu Ala Trp lie Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala 
1460 1465 1470 

gcc att ggt gat gac tac gcc aca gac tcc ctc aac aag cct gat gac 4464 
Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp 
1475 1480 1485 

etc aag cag tac ate ttc atg act gac tec aag ggc aca gec act gat 4512 

Leu Lys Gin Tyr lie Phe Met Thr Asp Ser Lys Gly Thr Ala Thr Asp 
1490 1495 1500 

gtc tct ggt cca gtg gag ate aac act gca ate age cca gee aag gtc 4560 

Val Ser Gly Pro Val Glu lie Asn Thr Ala lie Ser Pro Ala Lys Val 
1505 1510 1515 

caa ate att gtc aag get ggt ggc aag gag caa acc ttc aca get gac 4608 

Gin lie lie Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp 
1520 1525 1530 1535 

aag gat gtc tcc atc cag cca agc cca tcc ttc gat gag atg aac tac 4656 
Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr 
1540 1545 1550 

caa ttc aac get ctt gag att gat ggt tct ggc etc aac ttc ate aac 4704 

Gin Phe Asn Ala Leu Glu lie Asp Gly Ser Gly Leu Asn Phe He Asn 
1555 1560 1565 

aac tct get tec att gat gtc acc ttc act gec ttc get gag gat ggc 4752 

Asn Ser Ala Ser He Asp Val Thr Phe Thr Ala Phe Ala Glu Asp Gly 
1570 1575 1580 

cgc aag ttg ggt tac gag agc ttc tcc atc cca gtc acc ctt aag gtt 4800 
Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu Lys Val 
1585 1590 1595 

tec act gac aac gca etc acc ctt cat cac aac gag aac ggt get cag 4848 

Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly Ala Gin 
1600 1605 1610 1615 

tac atg caa tgg caa age tac cgc acc agg ttg aac acc etc ttc gca 4896 

Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala 
1620 1625 1630 

agg caa ctt gtg gcc cgt gcc acc aca ggc att gac acc atc ctc agc 4944 
Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser 
1635 1640 1645 
atg gag act cag aac atc caa gag cca cag ttg ggc aag ggt ttc tac 4992 
Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr 
1650 1655 1660 

gec acc ttc gtc ate cca cct tac aac etc age act cat ggt gat gag 5040 

Ala Thr Phe Val He Pro Pro Tyr Asn Leu Ser Thr His Gly Asp Glu 
1665 1670 1675 

agg tgg ttc aag etc tac ate aag cac gtg gtt gac aac aac tec cac 5088 

Arg Trp Phe Lys Leu Tyr He Lys His Val Val Asp Asn Asn Ser His 

1680 1685 1690 1695 

ate ate tac tct ggt caa etc act gac acc aac ate aac ate acc etc 5136 

He He Tyr Ser Gly Gin Leu Thr Asp Thr Asn He Asn He Thr Leu 

1700 1705 1710 

ttc ate cca ctt gac gat gtc cca etc aac cag gac tac cat gee aag 5184 

Phe He Pro Leu Asp Asp Val Pro Leu Asn Gin Asp Tyr His Ala Lys 

1715 1720 1725 

gtc tac atg acc ttc aag aag tct cca tct gat ggc acc tgg tgg ggt 5232 

Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly 
1730 1735 1740 

cca cac ttc gtc cgt gat gac aag ggc ate gtc acc ate aac cca aag 5280 

Pro His Phe Val Arg Asp Asp Lys Gly He Val Thr He Asn Pro Lys 
1745 1750 1755 

tec ate etc acc cac ttc gag tct gtc aac gtt etc aac aac ate tec 5328 

Ser lie Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn He Ser 

1760 1765 1770 1775 

tct gag cca atg gac ttc tct ggt gcc aac tcc ctc tac ttc tgg gag 5376 
Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu 
1780 1785 1790 

ttg ttc tac tac aca cca atg ctt gtg get caa agg ttg etc cat gag 5424 

Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gin Arg Leu Leu His Glu 

1795 1800 1805 

cag aac ttc gat gag gec aac agg tgg etc aag tac gtc tgg age cca 5472 

Gin Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro 
1810 1815 1820 

tct ggt tac att gtg cat ggt caa ate cag aac tac caa tgg aac gtc 5520 

Ser Gly Tyr lie Val His Gly Gin He Gin Asn Tyr Gin Trp Asn Val 
1825 1830 1835 

agg cca ttg ctt gag gac acc tec tgg aac tct gac cca ctt gac tct 5568 

Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser 

1840 1845 1850 1855 

gtg gac cct gat get gtg get caa cat gac cca atg cac tac aag gtc 5616 

Val Asp Pro Asp Ala Val Ala Gin His Asp Pro Met His Tyr Lys Val 

1860 1865 1870 

tec acc ttc atg agg acc ttg gac etc ttg att gec aga ggt gac cat 5664 

Ser Thr Phe Met Arg Thr Leu Asp Leu Leu He Ala Arg Gly Asp His 

1875 1880 1885 
gct tac cgc caa ttg gag agg gac acc ctc aac gag gca aag atg tgg 5712 
Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys Met Trp 
1890 1895 1900 

tac atg caa get etc cac etc ttg ggt gac aag cca tac etc cca etc 5760 
Tyr Met Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu 
1905 1910 1915 

age acc act tgg tec gac cca agg ttg gac cgt get get gac ate acc 5808 
Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp lie Thr 
1920 1925 1930 1935 

act cag aac get cat gac tct gee att gtt get etc agg cag aac ate 5856 
Thr Gin Asn Ala His Asp Ser Ala He Val Ala Leu Arg Gin Asn He 
1940 1945 1950 

cca act cct get cca etc tec etc aga tct get aac acc etc act gac 5904 
Pro Thr Pro Ala Pro Leu Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp 
1955 1960 1965 

ttg ttc etc cca cag ate aac gag gtc atg atg aac tac tgg caa acc 5952 
Leu Phe Leu Pro Gin He Asn Glu Val Met Met Asn Tyr Trp Gin Thr 
1970 1975 1980 

ttg get caa agg gtc tac aac etc aga cac aac etc tec att gat ggt 6000 
Leu Ala Gin Arg Val Tyr Asn Leu Arg His Asn Leu Ser He Asp Gly 
1985 1990 1995 

caa cca etc tac etc cca ate tac gee aca cca get gac cca aag get 6048 
Gin Pro Leu Tyr Leu Pro He Tyr Ala Thr Pro Ala Asp Pro Lys Ala 
2000 2005 2010 2015 

ctt etc tct get get gtg get acc age caa ggt ggt ggc aag etc cca 6096 
Leu Leu Ser Ala Ala Val Ala Thr Ser Gin Gly Gly Gly Lys Leu Pro 
2020 2025 2030 

gag tec ttc atg tec etc tgg agg ttc cca cac atg ttg gag aac gee 6144 
Glu Ser Phe Met Ser Leu Trp Arg Phe Pro His Met Leu Glu Asn Ala 
2035 2040 2045 

cgt ggc atg gtc tec caa etc acc cag ttc ggt tec acc etc cag aac 6192 
Arg Gly Met Val Ser Gin Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn 
2050 2055 2060 

ate att gag agg caa gat get gag get etc aac get ttg etc cag aac 6240 
He He Glu Arg Gin Asp Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn 
2065 2070 2075 

cag gca get gag ttg ate etc acc aac ttg tec ate caa gac aag acc 6288 
Gin Ala Ala Glu Leu He Leu Thr Asn Leu Ser lie Gin Asp Lys Thr 
2080 2085 2090 2095 

att gag gag ctt gat get gag aag aca gtc ctt gag aag age aag get 6336 
lie Glu Glu Leu Asp Ala Glu Lys Thr Val Leu Glu Lys Ser Lys Ala 
2100 2105 2110 

ggt gee caa tct cgc ttc gac tec tac ggc aag etc tac gat gag aac 6384 
Gly Ala Gin Ser Arg Phe Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn 
2115 2120 2125 

atc aac gct ggt gag aac cag gcc atg acc ctc agg gct tcc gca gct 6432 
Ile Asn Ala Gly Glu Asn Gln Ala Met Thr Leu Arg Ala Ser Ala Ala 
2130 2135 2140 

ggt etc acc act get gtc caa gee tct cgc ttg get ggt gca get get 6480 

Gly Leu Thr Thr Ala Val Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala 
2145 2150 2155 

gac etc gtt cca aac ate ttc ggt ttc get ggt ggt ggc tec aga tgg 6528 

Asp Leu Val Pro Asn lie Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp 
2160 2165 2170 2175 

ggt gee att get gag get acc ggt tac gtc atg gag ttc tct gec aac 6576 

Gly Ala He Ala Glu Ala Thr Gly Tyr Val Met Glu Phe Ser Ala Asn 
2180 2185 2190 

gtc atg aac act gag get gac aag ate age caa tct gag acc tac aga 6624 

Val Met Asn Thr Glu Ala Asp Lys He Ser Gin Ser Glu Thr Tyr Arg 

2195 2200 2205 

agg cgc cgt caa gag tgg gag ate caa agg aac aac get gag gca gag 6672 

Arg Arg Arg Gin Glu Trp Glu He Gin Arg Asn Asn Ala Glu Ala Glu 
2210 2215 2220 



ttg aag caa atc gat gct caa ctc aag tcc ttg gct gtc aga agg gag 6720 
Leu Lys Gln Ile Asp Ala Gln Leu Lys Ser Leu Ala Val Arg Arg Glu 
2225 2230 2235 



get get gtc etc cag aag acc tec etc aag acc caa cag gag caa acc 6768 

Ala Ala Val Leu Gin Lys Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr 
2240 2245 2250 2255 

cag tec cag ttg get ttc etc caa agg aag ttc tec aac cag get etc 6816 

Gin Ser Gin Leu Ala Phe Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu 
2260 2265 2270 

tac aac tgg etc aga ggc cgc ttg get gee ate tac ttc caa ttc tac 6864 

Tyr Asn Trp Leu Arg Gly Arg Leu Ala Ala He Tyr Phe Gin Phe Tyr 
2275 2280 2285 

gac ctt get gtg gee agg tgc etc atg get gag caa gec tac cgc tgg 6912 

Asp Leu Ala Val Ala Arg Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp 
2290 2295 2300 

gag ttg aac gat gac tec gee agg ttc ate aag cca ggt get tgg caa 6960 

Glu Leu Asn Asp Asp Ser Ala Arg Phe He Lys Pro Gly Ala Trp Gin 
2305 2310 2315 

ggc acc tac get ggt etc ctt get ggt gag acc etc atg etc tec ttg 7008 

Gly Thr Tyr Ala Gly Leu Leu Ala Gly Glu Thr Leu Met Leu Ser Leu 
2320 2325 2330 2335 

get caa atg gag gat get cac etc aag agg gac aag agg get ttg gag 7056 

Ala Gin Met Glu Asp Ala His Leu Lys Arg Asp Lys Arg Ala Leu Glu 
2340 2345 2350 

gtg gag agg aca gtc tec ctt get gag gtc tac get ggt etc cca aag 7104 

Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys 
2355 2360 2365 

gac aac ggt cca ttc tcc ctt gct caa gag att gac aag ttg gtc agc 7152 
Asp Asn Gly Pro Phe Ser Leu Ala Gln Glu Ile Asp Lys Leu Val Ser 
2370 2375 2380 

caa ggt tct ggt tct get ggt tct ggt aac aac aac ttg get ttc ggc 7200 
Gin Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly 
2385 2390 2395 

get ggt act gac acc aag ace tec etc caa gee tct gtc tec ttc get 7248 
Ala Gly Thr Asp Thr Lys Thr Ser Leu Gin Ala Ser Val Ser Phe Ala 
2400 2405 2410 2415 

gac etc aag ate agg gag gac tac cca get tec ctt ggc aag ate agg 7296 
Asp Leu Lys lie Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys He Arg 
2420 2425 2430 

cgc ate aag caa ate tct gtc acc etc cca get etc ttg ggt cca tac 7344 
Arg He Lys Gin He Ser Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr 
2435 2440 2445 

caa gat gtc caa gca ate etc tec tac ggt gac aag get ggt ttg gcg 7392 
Gin Asp Val Gin Ala He Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala 
2450 2455 2460 

aac ggt tgc gag get ctt get gtc tct cat ggc atg aac gac tct ggt 7440 
Asn Gly Cys Glu Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly 
2465 2470 2475 

caa ttc caa ctt gac ttc aac gat ggc aag ttc etc cca ttc gag ggc 7488 
Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly 
2480 2485 2490 2495 

att gee att gac caa ggc acc etc acc etc tec ttc cca aac get tec 7536 
lie Ala He Asp Gin Gly Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser 
2500 2505 2510 

atg cca gag aag gga aag caa gee acc atg etc aag acc etc aac gat 7584 
Met Pro Glu Lys Gly Lys Gin Ala Thr Met Leu Lys Thr Leu Asn Asp 
2515 2520 2525 

atc atc ctc cac atc agg tac acc atc aag tgagctc 7621 
Ile Ile Leu His Ile Arg Tyr Thr Ile Lys 
2530 2535 